Applicant: Andre Lieber et al. 
U.S. Serial No: 09/980,564 
Filed: November 30, 2001 
Page: 18 

REMARKS 

Claims 121-145 are pending and being examined. Claims 121, 122 and 129-131 are 
amended herein. 

The amendments to the claims and specification do not involve new matter and are supported 
by the specification as originally filed. Entry of these amendments is respectfully 
requested. 

Support for the amended specification at page 10, lines 14-21 can be found in the 
originally-filed specification of the subject application at page 25, lines 9-19; and Figure 
17. This amendment also corrects a typographical error. 

Support for the amended specification at page 10, lines 23-31 and continuing at page 11, 
lines 1-3 can be found in the originally-filed specification of the subject application at 
Figure 18. 

Support for the amended specification at page 11, lines 5-28 can be found in the 
originally-filed specification of the subject application at page 87, lines 4-25; and Figure 
19. 

Support for the amended specification at page 11, lines 30-31 and continuing at page 12, 
lines 1-6 can be found in the originally-filed specification of the subject application at 
page 87, lines 27-31 and continuing at page 88, lines 1-13; and Figure 20. 

The amendment of the specification at page 12, lines 14-24 is merely to correct a 
typographical error. 

The amendment of the specification at page 12, lines 26-31 and continuing at page 13, 
lines 1-18 is merely to correct a typographical error. 

Support for amended claims 121 and 122 can be found in the originally-filed 
specification of the subject application at page 21, lines 24-29; page 22, lines 11-17; page 
22, lines 26-31; page 23, lines 1-2 and lines 11-16 and lines 19-30; page 24, lines 1-9; 
and Figure 17A. 

The amendments to claims 129-130 are merely to add sequence listing indicators. Support 
for amended claims 129-130 can be found in the originally-filed specification of the 
subject application at page 97, lines 2-27. 

The amendments to claims 129-130 are merely to add sequence listing indicators. Support 
for amended claim 131 can be found in the originally-filed specification of the subject 
application at page 14, line 27; page 96, lines 1-3; and Figure 27. 

Oath/Declaration: 

In the Office Action, the Patent Office deemed the previously submitted Declaration 
defective because non-initialed and/or non-dated alterations had been made. 

Applicants are currently in the process of executing the amended Combined Declarations 
and Power of Attorney. Executed documents will be forwarded to the Patent Office once 
they are available. 

Drawings: 

In the Office Action, the Patent Office objects to the drawings because Figure 17 contains 
parts A and B. In response, Applicants have amended the Brief Description of Figure 17 to 
include parts A and B. 

The Brief Description of Figure 18 describes Ad5/11, which does not appear in the figure. 
In response, Applicants have amended the Brief Description of Figure 18 to delete "Ad5/11". 

Also, Figures 19-20 contain parts C and D which are not described in the Brief 
Description of the Drawings. In response, Applicants have amended the Brief 
Description of Figures 19 and 20 to include parts A and B. 

Figures 22 and 23 refer to Figure 8 text for further descriptions of the details, but Figure 8 
does not correspond to the details indicated in Figures 22 and 23. In response, Applicants 
have amended the specification to delete the references to Figure 8. 

Figure 29 does not contain parts A-D which are described in the Brief Description of the 
Drawings. In response, Applicants submit herein a corrected drawing of Figure 29 
showing parts A-D. 

Additionally, Applicants submit herein a complete set of formal drawings of Figures 1-29 
(EXHIBIT 1). The formal drawings do not contain new matter. Applicants respectfully 
request their entry. 

Sequence Compliance: 



In the Office Action, the Patent Office requires Applicants to submit a substitute 
sequence listing, computer readable form, and a statement, because Figure 27 and claims 
129-131 contain sequences which are not included in a previously submitted sequence 
listing. 

In response, Applicants submit herein a substitute sequence listing in paper form 
(EXHIBIT 2) and computer readable form, and a Declaration under 37 C.F.R. §1.821(f) 
(EXHIBIT 3). 

Applicants' Invention: 

Applicants teach recombinant, double-stranded, hybrid Ad-AAV vectors, comprising a 
parallel and an anti-parallel DNA strand, where the parallel strand comprises the 
following elements: a left and right Ad.ITR sequence; an Ad packaging sequence; a first 
and second AAV.ITR; a first and second IR sequence; a heterologous promoter; a foreign 
gene sequence; and a gene sequence that mediates replication of the adenovirus in a 
transduced cell. The anti-parallel strand of the claimed vectors comprises a nucleotide 
sequence encoding a modified adenovirus fiber protein which alters the tropism of the 
adenovirus vector. 

REJECTION UNDER 35 U.S.C. §112, FIRST PARAGRAPH: 
Rejection of Claim 131: 

In the Office Action, the Patent Office issued a new matter rejection. The Patent Office 
alleges claim 131 contains subject matter which is not described in the specification in 
such a way as to reasonably convey to one skilled in the art that the inventor(s), at the 
time the application was filed, had possession of the claimed invention. 

The Patent Office alleges the peptide ligands having amino acid sequences LNFCSFC 
and LNGCGXXXXXXXXXXGC are not supported in the specification at page 14, line 
27, and Figure 27. Further, the Patent Office alleges the specification does not teach a 
use of these peptide ligands. 

Applicants respectfully disagree. The specification teaches various embodiments of 
heterologous peptides replacing a GH or HI loop region in order to retarget the gutless 
vector to a desired cell type. Support can be found in the originally-filed application at 
page 22, lines 14-17; and page 27, lines 16-24. 

Applicants also note that the amino acid sequence in claim 131(c) contains a 
typographical error. Applicants now amend claim 131(c) to recite the sequence 
LNGCGSGC. Support for amended claim 131(c) can be found in the originally-filed 
specification at page 14, line 27, and Figure 27 (please see "adeno 5 GH cys 2"). Further 
support can be found at page 27, lines 16-24. 

Applicants contend that the amino acid sequence in claim 131(d), namely 
LNGCGXXXXXXXXXXGC, is indeed supported in the originally-filed specification at 
page 14, line 27 and at Figure 27. In particular, Figure 27 shows a GH peptide having 
this sequence (please see "adeno 5 GH peptides"). Further support can be found at page 27, 
lines 12-20. 
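
By way of illustration only, the motif recited for claim 131(d) can be read as a pattern with fixed residues at both ends and a run of variable positions in between. The sketch below assumes that each X stands for any one of the 20 standard amino acids, which the response does not itself state. The helper names (motif_to_regex, matches_motif) are hypothetical.

```python
import re

# Motif as quoted in the response for claim 131(d).
MOTIF = "LNGCGXXXXXXXXXXGC"

# Assumption: "X" means "any one of the 20 standard amino acids".
STANDARD_AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def motif_to_regex(motif: str) -> re.Pattern:
    """Turn a motif with 'X' wildcards into a compiled regular expression."""
    parts = [f"[{STANDARD_AMINO_ACIDS}]" if ch == "X" else ch for ch in motif]
    return re.compile("".join(parts))


PATTERN = motif_to_regex(MOTIF)


def matches_motif(peptide: str) -> bool:
    """True if the whole peptide fits the motif, position by position."""
    return PATTERN.fullmatch(peptide) is not None


# A peptide with alanine at every variable position fits the motif;
# the shorter claim 131(c) sequence does not.
example = "LNGCG" + "A" * MOTIF.count("X") + "GC"
print(matches_motif(example))
print(matches_motif("LNGCGSGC"))
```

Under this reading, the motif describes a family of peptides of one fixed length rather than a single sequence.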

Rejection of Claims 121-145: 

Inverted Repeat Sequences: 

In the Office Action, the Patent Office rejects claims 121-145 under 35 U.S.C. §112, first 
paragraph, because the specification supports the claimed adenoviral vector having pairs 
of Ad ITRs and AAV ITRs, but the specification allegedly does not support a third pair 
of inverted repeat sequences. 

In response, Applicants respectfully disagree. Applicants contend that the originally-filed 
claims and specification provide support for an adenoviral vector having the third pair of 
inverted repeat (IR) sequences. 

For example, pending claims 121 and 122 contain the same subject matter as 
originally-filed claims 16, 26, 27, 28 and 33. 

Claim 16 recites a double-stranded Ad vector including a first strand (i.e., parallel strand) 
comprising: a left and right Ad-ITR; an Ad packaging sequence; a transgene cassette; and 
at least one Ad sequence which directs adenoviral replication. The second strand (i.e., 
anti-parallel strand) encodes an adenoviral fiber protein that permits targeting of the Ad 
vector into the host cell of interest. 

In an embodiment of claim 16, originally-filed claims 26 and 27 recite the transgene 
cassette comprises: a left and right ITR; a polyadenylation sequence; a transgene 
sequence; and a promoter sequence. 

In a further embodiment of the left and right ITR of claims 26 and 27, originally-filed 
claim 28 recites adenoviral-associated ITRs. 

And in a further embodiment of claims 26 and 27, originally-filed claim 33 recites the 
transgene cassette further comprising an inverted repeat sequence. 

Thus, in pending claims 121 and 122, support for inverted repeat sequences can be found 
in originally-filed claims 16, 26, 27, 28 and 33. 

In the originally-filed specification, support for the inverted repeat sequences can be 
found at: Figure 17A; the Brief Description of Figures at page 10, lines 14-21; and at 
Example II G at page 86, lines 10-20. 

Source of the Inverted Repeat Sequences: 

In the Office Action, the Patent Office also alleges Applicants provide no guidance for 
sequences that are meant to constitute the inverted repeat sequences in claims 121 and 
122, steps (d) and (g). 

Applicants respectfully disagree. Applicants contend the originally-filed specification 
discloses the inverted repeat sequences which "mediate predictable genomic 
rearrangements resulting in a gutless vector genome devoid of all viral genes" (page 23, 
lines 19-21). 

Further, the originally-filed specification discloses the inverted repeat sequence is a 1.2kb 
inverted homology element (page 10, lines 16-17). 

One embodiment of the inverted repeat sequence is the 1.2kb fragment from a chicken 
globin HS-4 element. Support for this embodiment can be found in Steinwaerder, et al., 
1999, J. Virology 73:9303-9313, which is incorporated by reference (please see the 
originally-filed specification at page 32, lines 23-24) (a copy of Steinwaerder is attached 
as EXHIBIT 4). 

Steinwaerder, et al., discloses producing gutless adenoviral vectors comprising the 1.2kb 
chicken globin HS-4 element which includes inverted repeat sequences (see 
Steinwaerder, et al., at: Abstract, lines 4-6; also page 9304, right column, lines 5-9 under 
"Results"). Steinwaerder, et al., reports that the chicken globin HS-4 element mediates 
predictable genomic rearrangements when inserted into the E1 region of an adenoviral 
vector. 

Additionally, the nucleotide sequence of the HS-4 element was known to those skilled in 
the art at Genbank, accession no. U78775 (submitted November 18, 1996 to National 
Institute of Health; see also Steinwaerder, et al., at: page 9304, left column, lines 1-4 
after "Materials and Methods"). 

Applicants also contend it was well known to the skilled artisan that other inverted repeat 
sequences mediate DNA recombination processes, including: bacterial transposons Tn5 
and Tn10 (EXHIBIT 5 is Gordenin, et al., September 1993, Molecular and Cellular 
Biology 13:5315-5322); sequences contained in the left and right terminal ends of Ad5 
genome (EXHIBIT 6 is Chartier, et al., July 1996 Journal of Virology 70: 4805-4810); 
yeast ARMs or DNA at-risk-motifs (EXHIBIT 7 is Gordenin and Resnick May 1998 
Mutat. Res. 400:45-58); transposable elements from insects (EXHIBIT 8 is O'Brochta 
and Atkinson, Sept-Oct 1996 Insect Biochem Mol Biol 26:739-753); and direct and 
inverted tandem repeat sequences from human (EXHIBIT 9 is Lambert, et al., Apr 1999 
Mutat. Res. 433:159-168). 

The Modified Fiber Protein: 

In the Office Action, the Patent Office states that the specification does not teach how to 
modify the second strand to encode a modified fiber (Office Action, page 6, lines 7-8; 
page 6, line 22 to page 7, line 1). 

Applicants respectfully disagree. The specification discloses many embodiments of 
modified adenoviral fiber proteins. 

For example, the specification discloses heterologous peptide sequences comprising an 
Ad5 fiber tail domain and an Ad35 fiber shaft and knob domain (page 18, lines 9-16), or 
an Ad5 fiber tail domain and Ad11 fiber shaft domain (page 26, lines 27-30), or fiber 
proteins from Ad 3, 7, 9, 11, and 35 (page 26, lines 15-20). The resulting Ad5/35 and 
Ad5/11 fiber proteins showed broader tropism (see Table I, below). 

Applicants also provide an extensive list of various adenoviral serotypes which can be 
used by the skilled artisan to retarget the host cell of interest (see Appendix I in the 
specification as originally filed at pages 107-113). 

Applicants teach how to make a hybrid Ad-AAV vector comprising a nucleotide 
sequence encoding a modified adenoviral fiber protein. Support can be found in the 
originally-filed specification at Example II, Section G, at page 83, lines 14-31; page 84, 
lines 1-4; page 85, lines 15-20; and Figures 16A and 18. 

Other examples include replacing the G-H or H-I loop with heterologous peptide 
sequences such as the RI or RII protein from malaria circumsporozoite surface protein 
(page 27, lines 28-30; page 28, lines 6-12; and Example II J at page 96, line 31 and 
continuing at page 97, lines 1-27). 

In other embodiments, the G-H or H-I loop is replaced with two peptides such as the 
RGD-4 peptide and a peptide that binds matrix metalloproteinase (page 98, lines 1-10). 

Yet other embodiments include heterologous peptides depicted in Figure 27. 

Support for Ad.AAV hybrid vectors comprising nucleotide sequences encoding modified 
fiber proteins that alter the tropism of the vector can be found in the originally-filed 
specification in the Figures, as shown in Table I below. 

In particular, the data in Figure 25 show that virions encoded by the Ad5L vector attach to, 
internalize into and transduce 293 cells with high efficiency, but K562 cells only with low 
efficiency. The virions encoded by the Ad5/35L and Ad5/35S vectors attach to, internalize 
into and transduce both 293 and K562 cells with high efficiency. With these data Applicants 
teach how to make and use an Ad.AAV vector encoding a modified fiber protein that 
alters tropism of the encoded virion. 



TABLE I 

| Figures | Depicts | Ad vectors/Cells |
|---|---|---|
| Figure 12 | Tropism (e.g., attachment and internalization) of non-modified Ad virions to cells | Ad 3, 4, 5, 9, 35 and 41; CHO (ovary); HeLa (cervical); CD34+ (bone marrow) |
| Figure 13 | Tropism (e.g., attachment and internalization) of non-modified Ad virions to cells | Ad 3, 5, 9, 35 and 41; CHO (ovary); HeLa (cervical); 293 (kidney) |
| Figure 14 | Tropism (e.g., attachment and internalization) of non-modified Ad virions to cells | Ad 3, 5, 9, 35 and 41; K-562 (hematopoietic); CD34+ (bone marrow) |
| Figure 16 | Schematic showing vector construct of a serotype hybrid Ad construct having Ad5 tail and Ad35 knob and short shaft: "Ad5-GFP-F35"; see also page 9, lines 21-31 and page 10, lines 1-12 | Ad5/Ad35 hybrid |
| Figure 18 | Schematic showing vector construct of a serotype hybrid Ad construct having Ad5 tail and Ad35 knob and short shaft: "Ad5-GFP-F35" | Ad5/Ad35 hybrid |
| Figure 19 | Cross competition data showing tropism (e.g., attachment and internalization) of Ad5-GFP-F35 virions to K-562 cells | Ad5/Ad35 hybrid; K-562 (hematopoietic) |
| Figure 20 | Cross competition data showing tropism (e.g., attachment and internalization) of Ad5-GFP-F35 virions to K-562 cells | Ad5/Ad35 hybrid; K-562 (hematopoietic) |
| Figure 21 | Graph showing transduction (gene transfer) of cells with Ad5-GFP-F35 vector | Ad5/Ad35 hybrid; CD34+ (bone marrow); K-562 (hematopoietic); HeLa (cervical) |
| Figure 22 | FACS analysis of CD34+ cells transduced with Ad5-GFP-F35, and then sorted for transduced cells that express GFP (from transduction) | Ad5/Ad35; CD34+ (bone marrow) |
| Figure 23 | FACS analysis of CD34+ cells transduced with Ad5-GFP-F35, and then sorted for transduced cells that express GFP (from transduction) and also express CD117+ (a.k.a. c-kit) | Ad5/Ad35; CD34 and CD117 (precursor bone marrow cells, e.g., stem cells) |
| Figure 25 | Upper: schematic depicting hybrid Ad virions having Ad5 knob with stem and tail of Ad9 or Ad35. Lower: table showing tropism data (e.g., attachment, internalization, and transduction data) of hybrid virions | Ad5/Ad9; Ad5/Ad35; 293 (kidney, CAR+); K-562 (hematopoietic, little CAR); Y79 (CAR+) |
| Figure 27 | Schematic depicting Ad5, wild-type sequence of GH loop, and also showing 4 variants (mutant sequences) of GH loop | Sequences of mutant GH loop |
| Figure 28 | Tropism (e.g., attachment and internalization) of non-modified Ad virions to cells | Ad 3, 5, 9, 35 and 41; REVC (vascular endothelial); JAWSII (dendritic); MCF7 (breast cancer); 8714 (T lymphocyte) |

REJECTION UNDER 35 U.S.C. §112, SECOND PARAGRAPH: 



In the Office Action, the Patent Office rejects claims 121-145 under 35 U.S.C. §112, 
second paragraph because the terms "first strand" and "second strand" are unclear. In 
particular, the Patent Office states that it is not clear whether the "first and second 
strands" correspond to the r and l strands of the double-stranded adenovirus or to the 
segments of DNA that are each double-stranded. 

In response, Applicants contend the terms "first strand" and "second strand" refer to the 
inventive adenoviral vector comprising a first single strand of DNA, which also is termed 
a parallel strand, and a second single strand of DNA which is anti-parallel to the 
first DNA strand. 

Applicants have amended claims 121 and 122 to recite parallel and anti-parallel strands. 
Support for this amendment can be found in the originally-filed specification at page 19, 
lines 24-30 and Figure 1A, which shows a double-stranded vector with each strand having 
a 5' end. 

CONCLUSION 

If a telephone interview would be of assistance in advancing prosecution of the subject 
application, Applicants' undersigned attorney invites the Examiner to telephone her at the 
number provided below. 

The Patent Office is authorized to charge a one-month extension of time fee under 37 
C.F.R. §1.17(a)(1) of $55.00 to Deposit Account No. 50-0306. No fee, other than the fee 
for a one-month extension of time, is deemed necessary in connection with the filing of 
this Communication. However, if any additional fee is necessary, the Patent Office is 
authorized to charge any additional fee to Deposit Account No. 50-0306. 



Respectfully submitted, 




Sarah B. Adriano 

Registration No. 34,470 

Mandel & Adriano 

55 So. Lake Ave., Suite 710 

Pasadena, California 91101 

626/395-7801 

Customer No: 26,941 

Direct or inverse repeated sequences are important functional features of prokaryotic and eukaryotic 
genomes. Considering the unique mechanism, involving single-stranded genomic intermediates, by which 
adenovirus (Ad) replicates its genome, we investigated whether repetitive homologous sequences inserted into 
E1-deleted adenoviral vectors would affect replication of viral DNA. In these studies we found that inverted 
repeats (IRs) inserted into the E1 region could mediate predictable genomic rearrangements, resulting in 
vector genomes devoid of all viral genes. These genomes (termed ΔAd.IR) contained only the transgene cassette 
flanked on both sides by precisely duplicated IRs, Ad packaging signals, and Ad inverted terminal repeat 
sequences. Generation of ΔAd.IR genomes could also be achieved by coinfecting two viruses, each providing 
one inverse homology element. The formation of ΔAd.IR genomes required Ad DNA replication and appeared 
to involve recombination between the homologous inverted sequences. The formation of ΔAd.IR genomes did 
not depend on the sequence within or adjacent to the inverted repeat elements. The small ΔAd.IR vector 
genomes were efficiently packaged into functional Ad particles. All functions for ΔAd.IR replication and 
packaging were provided by the full-length genome amplified in the same cell. ΔAd.IR vectors were produced 
at a yield of ~10^4 particles per cell, which could be separated from virions with full-length genomes based on 
their lighter buoyant density. ΔAd.IR vectors infected cultured cells with the same efficiency as first-generation 
vectors; however, transgene expression was only transient due to the instability of deleted genomes within 
transduced cells. The finding that IRs present within Ad vector genomes can mediate precise genetic 
rearrangements has important implications for the development of new vectors for gene therapy approaches. 



The starting point for the presented study was an observation made with first-generation 
adenovirus (Ad) vectors that contained fragments of Ad5 DNA, specifically the vaI (23) or 
the precursor to the terminal protein (pTP) (26) genes, inserted into the E1 region. The 
presence of these sequences in addition to the corresponding endogenous gene resulted in the 
appearance of two viral bands with different buoyant density in CsCl gradients after 
ultracentrifugation of lysates from infected 293 cells. This phenomenon was interesting, 
considering the unique mechanism by which the adenovirus replicates and the functional 
potential of repetitive sequences to mediate genetic rearrangements. 

The genomes of Ad2 and Ad5 are double-stranded, linear DNA molecules, approximately 35 kb 
in length with an inverted terminal repeat sequence (ITR) of 102 bp on each end. Numerous 
studies in cell-free systems and in infected cells have established that Ad DNA replication 
takes place in two steps (reviewed in references 2 and 37). In the first stage, DNA 
synthesis is initiated by pTP. pTP binds as a heterodimer with the Ad polymerase (Pol) to 
specific sites within the ITRs. Ad DNA replication begins at both ends of the linear genome, 
resulting in a daughter strand that is synthesized in the 5' to 3' direction, displacing the 
parental strand with the same polarity. Three nonexclusive mechanisms are proposed for the 
second step, the replication of the displaced parental strand. (i) Displaced single strands 
can form partial duplexes by base pairing of the ITRs on which a second round of DNA 
synthesis may be initiated (22, 36). (ii) When two oppositely moving displacement forks 
meet, the two parental strands can no longer be held together and therefore separate, 
resulting in partially duplex and partially single-stranded molecules; the synthesis is 
then completed on the displaced parental strand (21). (iii) Displaced strands, with opposite 
polarity resulting from initiation at two different molecular ends, can renature to form a 
double-stranded daughter molecule (37). Elongation of DNA synthesis requires only DNA 
binding protein (DBP) and Pol. With 20 to 30 bp being synthesized per second, Ad elongation 
is relatively slow compared to that in the eukaryotic replication systems (which synthesize 
~500 bp/s). DBP may stabilize the formation of the panhandle structure and the interstrand 
renaturation process (39). 
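
To put the quoted elongation rates in perspective, the following back-of-the-envelope sketch estimates how long a single fork would need to traverse a genome of about 35 kb. It assumes one fork moving at a constant rate over the whole genome, which is a simplification and not a claim made in the text; the constant and function names are illustrative.

```python
# Figures taken from the text: ~35 kb genome, 20-30 bp/s for Ad,
# ~500 bp/s for eukaryotic replication systems.
GENOME_BP = 35_000
AD_RATE_SLOW_BP_PER_S = 20
AD_RATE_FAST_BP_PER_S = 30
EUKARYOTIC_RATE_BP_PER_S = 500


def traverse_seconds(length_bp: int, rate_bp_per_s: float) -> float:
    """Time for one fork to copy length_bp at a constant rate (simplifying assumption)."""
    return length_bp / rate_bp_per_s


ad_slow = traverse_seconds(GENOME_BP, AD_RATE_SLOW_BP_PER_S)       # 1750 s, about 29 min
ad_fast = traverse_seconds(GENOME_BP, AD_RATE_FAST_BP_PER_S)       # about 1167 s, about 19 min
eukaryotic = traverse_seconds(GENOME_BP, EUKARYOTIC_RATE_BP_PER_S)  # 70 s

print(f"Ad, 20 bp/s: {ad_slow / 60:.1f} min")
print(f"Ad, 30 bp/s: {ad_fast / 60:.1f} min")
print(f"Eukaryotic rate, 500 bp/s: {eukaryotic / 60:.1f} min")
```

On these assumptions the displaced parental strand would remain single-stranded for many minutes, which is the window in which the intermediates discussed above could form.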

Repetitive sequences are a common feature of prokaryotic and eukaryotic genomes. Direct 
repeats (DR) and inverted repeats (IR) are associated with DNA recombination processes 
(5, 20, 29). Furthermore, it is thought that IR-induced DNA secondary structures cause 
pausing of replication by DNA polymerases and reverse transcriptases, resulting in genetic 
alterations (1, 7, 12, 13, 19, 38). 
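
The distinction between direct and inverted repeats can be made concrete with a short sketch. The sequences below are toy strings chosen for illustration, not sequences from the study, and the function names are hypothetical. An inverted repeat is modeled here as a pair of arms in which the second arm is the reverse complement of the first on the same strand.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def has_flanking_inverted_repeat(seq: str, arm_len: int) -> bool:
    """True if the last arm_len bases are the reverse complement of the first arm_len bases."""
    if arm_len <= 0 or 2 * arm_len > len(seq):
        return False
    return seq[-arm_len:] == reverse_complement(seq[:arm_len])


# Toy example: a "cassette" flanked by an arm and either its inverted or direct copy.
arm = "GATTACA"
cassette = "TTTTGGGG"
inverted = arm + cassette + reverse_complement(arm)  # IR-flanked
direct = arm + cassette + arm                        # DR-flanked

print(has_flanking_inverted_repeat(inverted, len(arm)))  # True
print(has_flanking_inverted_repeat(direct, len(arm)))    # False
```

On a single strand, the two arms of the IR-flanked string can base-pair with each other, which is the property that allows secondary structures of the kind mentioned above.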

The unique Ad replication strategy, involving single-stranded replication intermediates, 
prompted us to investigate in detail whether repetitive homologous sequences inserted into 
the Ad vector genome would affect replication of viral DNA or whether it would induce 
genomic rearrangements. In these studies, we have found that, as a result of the replication 
of E1-deleted Ad vectors containing IR flanking a transgene cassette, a small viral genome 
is efficiently formed and packaged. These genomes were devoid of all Ad genes. Particles 
containing this small genome could be separated from virions with full-length genomes by 
ultracentrifugation in CsCl gradients. In addition to having interesting virological 
aspects, this finding has practical importance for Ad vector development. 

Ads have a number of properties that make them attractive vehicles for gene transfer. These 
include highly efficient mechanisms of gene transfer to a large variety of cell types in 
vivo and the easy production of purified virus at high titers. Highly efficient transduction 
is mediated by the capsid and core proteins involved in cell attachment and by 
internalization, endosomal lysis, and nuclear import. Most Ad vectors used for in vivo gene 
transfer are deleted for E1 genes. The major limitation associated with these E1-deleted 
vectors has been in short-term expression in vivo, due to the development of immune 
responses to expressed viral proteins which result in toxicity and viral clearance. In order 
to overcome some of these problems, Ad vectors have been developed from which almost the 
entire Ad genome has been deleted. These include vectors with "gutless," almost full-length 
genomes, which have been shown to mediate stable transgene expression in vivo (32), as well 
as encapsidated Ad minichromosomes with genomes of ~13 kb, which also have successfully been 
used for gene transfer in vitro and in vivo (8, 17, 18). Both vector systems require helper 
viruses and several serial passages for production. 

As an application of our finding that IRs can mediate predictable genetic rearrangements 
within Ad genomes, we demonstrate here the efficient and straightforward production of 
new vectors representing small Ad genomes devoid of all viral genes, which are packaged 
into functional Ad capsids. 

MATERIALS AND METHODS 

Production and characterization of viral vectors. Plasmids. Sequences of the 
metal responsive element (MRE) and HS-4 element were taken from the plasmids 
pMRENeo (33) (provided by Richard Palmiter, University of Washington) 
and pJC5-4 (GenBank accession no. U78775) (a gift from G. Felsenfeld, National 
Institutes of Health). The human α1-antitrypsin (hAAT) cDNA linked to 
a bovine growth hormone (bPA) gene polyadenylation signal was derived from 
pBShAAT (14). 

All Ad shuttle vectors were based on pΔE1sp1A (Microbix, Toronto, Canada). 
The construction of the vectors Ad.hAATa/b, Ad.Ins1/1a/b, Ad.Ins1/3a/b, 
Ad.Ins2/1a/b, Ad.Ins2/2a/b, and Ad.IR(G/C) is described elsewhere (25, 35). 
Vectors containing shortened IRs [Ad.IR(1.0) and Ad.IR(0.5)] were generated 
by the digestion of pIns2/2 (35) with AflII and HindIII. The isolated AflII 
fragment containing shortened insulator sequences and the MREhAAT cassette 
was blunted (T4 polymerase) and ligated to the EcoRV site of pΔE1sp1A 
[pAd.IR(0.5)]. Partial digestion of pIns2/2 with HindIII revealed the 4.1-kb 
fragment containing two shortened insulators flanking the MREhAAT cassette. 
The fragment was ligated to the HindIII site of pΔE1sp1A [pAd.IR(1.0)]. 

For the construction of Ad.IR-bGal, a transgene cassette containing the 
β-galactosidase (β-Gal) cDNA fused to the simian virus 40 (SV40) polyadenylation 
signal (SV40pA) was generated. The SV40pA was cut out from pREP4 (Invitrogen, 
Carlsbad, Calif.) by XhoI/SalI digestion and was ligated to the XhoI site of 
pBluescript SK(+) (Stratagene, La Jolla, Calif.), resulting in pBS-SV40pA. A 
3.7-kb BamHI fragment containing the bGal cDNA was derived from pCMVbGal (24) 
and was inserted into the corresponding site of pBS-SV40pA, resulting in 
pbGal-SV40pA. Ligation of a 4.1-kb XbaI/KpnI fragment of pbGal-SV40pA containing 
bGal-SV40pA to the corresponding sites of pSLJCb (35) resulted in pJC(1)bGal. 
pJC(2)bGal was created by insertion of an HS-4-containing SpeI/NheI fragment 
derived from pSLJCa (35) into the SpeI sites of pJC(1)bGal. In the right 
orientation, the complete cassette holding two inverted HS-4 elements flanking 
bGal-SV40pA was cut from pJC(2)bGal by SpeI/NheI digestion and was inserted into 
the XbaI site of pAd.RSV (14), generating pAd.IR-bGal. 

Ads. First-generation viruses with the different transgene cassettes incorporated 
into their E1 regions were generated by recombination of the pΔE1sp1A-derived 
shuttle plasmids and pJM17 or pBHG10 (Microbix) in 293 cells as 
previously described (27). For each virus, at least 20 plaques were picked, 
amplified, and analyzed by restriction digest. Plaques from viruses with the 
correct genome structure were amplified, CsCl banded, and titered (in PFU per 
milliliter) as previously described (14, 27). All virus preparations tested negative 
for replication-competent Ad and bacterial endotoxin (28). Virus was stored at 
-80°C in a solution containing 10 mM Tris-Cl (pH 7.5), 1 mM MgCl2, and 10% 
glycerol. 

To produce purified ΔAd.IR-1, 293 cells were infected with Ad.2/2a at a 
multiplicity of infection (MOI) of 10 PFU/cell and were harvested 40 h after 
infection. Cells were lysed in phosphate-buffered saline by four cycles of freezing 
and thawing. Lysates were centrifuged to remove cell debris and were digested 
for 30 min at 37°C with 500-U/ml DNase I and 200-μg/ml RNase A in the 
presence of 10 mM MgCl2. Five milliliters of lysate was layered on a CsCl step 
gradient (0.5 ml at 1.5 g/cm^3, 2.5 ml at 1.35 g/cm^3, and 4 ml at 1.25 g/cm^3) and 
ultracentrifuged for 2 h at 35,000 rpm (rotor SW41). CsCl fractions were collected 
by puncturing the tube and were analyzed for viral DNA (27) or were 
subjected to ultracentrifugation at 35,000 rpm for 18 h in an equilibrium gradient 
with 1.32 g of CsCl per cm^3. The band containing the deleted virus ΔAd.IR-1 was 
clearly separated (0.5-cm distance) from other banded viral particles containing 
full-length Ad.Ins2/2a genomes. Fractions containing deleted virus particles were 
dialyzed against a solution containing 10 mM Tris-Cl (pH 7.5), 1 mM MgCl2, and 
10% glycerol and were stored at -80°C. The genome titer of ΔAd.IR-1 preparations 
was determined based on quantitative Southern analysis of viral DNA 
purified from viral particles in comparison to different concentrations of a 1.7-kb 
hAAT-bPA fragment of pBShAAT (for ΔAd.IR-1) according to a protocol 
previously described (27). Titers were routinely obtained in the range of 3 × 10^12 
to 8 × 10^12 genomes per ml. Assuming one genome is packaged per capsid, the 
genome titer equals the particle titer. The level of contaminating Ad.Ins2/2a in 
ΔAd.IR-1 preparations was less than 0.1% as determined by Southern analysis, 
which is consistent with results obtained by plaque assay of 293 cells (fewer than 
five plaques per 10^6 total genomes). 
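
The consistency statement above can be checked with simple arithmetic. The sketch below converts the plaque-assay bound into a fraction and compares it with the Southern-analysis bound. It treats plaques and genomes as directly comparable, which is a simplifying assumption, since a plaque assay counts infectious units rather than genomes. The variable names are illustrative.

```python
# Bounds quoted in the text.
SOUTHERN_UPPER_BOUND = 0.1 / 100   # "less than 0.1%" contaminating full-length genomes
PLAQUES_UPPER_BOUND = 5            # "fewer than five plaques"
GENOMES_ASSAYED = 1e6              # "per 10^6 total genomes"
TITER_RANGE_PER_ML = (3e12, 8e12)  # genome titers routinely obtained

# Plaque-forming fraction implied by the plaque assay (upper bound).
plaque_fraction = PLAQUES_UPPER_BOUND / GENOMES_ASSAYED

print(f"Plaque-assay bound: {plaque_fraction:.4%} of genomes")
print(f"Southern bound:     {SOUTHERN_UPPER_BOUND:.4%} of genomes")
print("Consistent:", plaque_fraction < SOUTHERN_UPPER_BOUND)

# Upper bound on plaque-forming contaminants per ml at each end of the titer range.
for titer in TITER_RANGE_PER_ML:
    print(f"titer {titer:.0e}/ml -> fewer than {titer * plaque_fraction:.1e} PFU/ml")
```

The plaque-assay figure is the tighter of the two bounds, so it does not contradict the Southern result.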

Primers used for sequencing the AAd.IR-1 genome, specific to Ad5 nucleotides
(nt) 319 to 338 and 3550 to 3531, were AdF, 5'-TTGTGTTACTCATAGCGCGT,
and AdR, 5'-TTCTTTCCCACCCTTAAGCC. The nested primers to
obtain the complete sequence of the IR elements in AAd.IR-1 were 5' (nt 552)
TGACATTGTTGGTCTGGC and 5' (nt 947) GAAAAGCTCCAAGATCCC.

Electron microscopy. For examination of viral particles in the transmission 
electron microscopy studies, CsCl-purified virions were fixed with glutaraldehyde 
and were stained with uranyl acetate as described previously (27). 

Cell culture. SKHEP-1 cells (HTB-52; American Type Culture Collection,
Rockville, Md.), an endothelial cell line derived from human liver (10), were
grown in high-glucose Dulbecco's modified Eagle medium with 10% fetal calf 
serum. 

Analysis of viral DNA. Lysates from 2 × 10^5 cells that had developed complete
cytopathic effect (CPE) after viral infection or viral material banded in CsCl
gradients were digested with pronase (1 mg/ml in a solution containing 10 mM
Tris-Cl [pH 7.4], 10 mM EDTA [pH 8.0], and 1% sodium dodecyl sulfate) for 2 h.
DNA was extracted with phenol-chloroform and was precipitated in ethanol. 
DNA samples were then subjected to gel electrophoresis or restriction analysis. 

Southern blotting. Cultured cells were washed three times with phosphate-buffered
saline before harvesting. For analysis, 10 μg of genomic DNA was
digested with restriction endonucleases at 37°C overnight and then electrophoresed
in a 0.8% agarose gel and transferred to a nylon membrane (Hybond N+;
Amersham, Arlington Heights, Ill.). The blots were hybridized in rapid hybridization
buffer (Amersham) with [α-32P]dCTP-labeled DNA probes (>10^8
cpm/μg of DNA). The fragment used for labeling was the 1.7-kb hAAT-bPA
fragment of pBShAAT.

hAAT ELISA. hAAT concentrations in cell culture supernatants were deter- 
mined by enzyme-linked immunosorbent assay (ELISA) as previously described 
(14). 

RESULTS 

Considering the unique mechanism by which Ad replicates 
its genome, involving single-stranded genomic intermediates 
able to form intra- and intermolecular hybrids, we decided to 
study in more detail whether repeated sequences inserted into 
the viral genome would affect replication of viral DNA. To this 
end, a number of first-generation Ad vectors were used that
contained single or repeated copies of a 1.2-kb chicken globin
HS-4 element (3) inserted into the E1 region together with a
reporter gene cassette (Fig. 1A). These vectors were originally
designed for an unrelated study that employed the HS-4 element
as an insulator to shield a heterologous, inducible promoter
from interference by Ad enhancers (35). Control vectors
consisted of the promoterless transgene only (Ad.hAATa)
or the transgene expression cassette combined with one
HS-4 element (Ad.Ins1/1a and Ad.Ins1/3a). Ad.Ins2/2a and
Ad.Ins2/2b contained the HS-4 elements as IRs flanking the
reporter gene cassette in leftward or rightward orientation,
respectively. In Ad.Ins2/1a and Ad.Ins2/1b, the transgene cassette
was flanked by HS-4 DRs. In a first screening for abnormal
vector replication products, viral DNA was isolated together
with chromosomal DNA from infected 293 cells after
the development of full CPE and was analyzed by gel electrophoresis
(Fig. 1B). The full-length (~35-kb) genome comigrated
with fragments of genomic DNA. Interestingly, a small
(~5.7-kb) band appeared in DNA samples isolated from cells
after infection with Ad.Ins2/2a and Ad.Ins2/2b, both of which
contained HS-4 elements as IRs. These bands were absent in
cells infected with the control vectors or the vectors containing
two HS-4 DRs. The 5.7-kb bands were identified as Ad vector
derivatives by Southern blotting with a transgene-specific probe
(data not shown). These derivative genomes are hereafter referred
to as AAd.IR-1 (for the Ad.Ins2/2a derivative) and
AAd.IR-2 (for the Ad.Ins2/2b derivative).

FIG. 1. Formation of genomic derivatives during replication of first-generation vectors carrying two IRs. (A) Structure of first-generation Ad vectors with transgene
cassettes inserted into the E1 region. The transgene cassette (hAAT) comprises an inducible MRE promoter (33) linked to hAAT cDNA and a polyadenylation signal
derived from the bPA. The IR sequence represents the 1.2-kb fragment of the HS-4 domain derived from the chicken β-globin locus (3). Ad.Ins1/1a and Ad.Ins1/3a
contain only one HS-4 element upstream or downstream of the transgene cassette. Ad.Ins2/2a, Ad.Ins2/2b, Ad.Ins2/1a, and Ad.Ins2/1b each contain two HS-4 elements
as IRs or DRs, respectively. Ad.hAATa contains only the hAAT cDNA and a polyadenylation signal derived from the bPA. (B) Replication studies. 293 cells were
infected with Ads at an MOI of 10 PFU/cell. Thirty-six hours postinfection (p.i.), total DNA was isolated from infected cells and analyzed (undigested) by
electrophoresis in agarose gels stained with ethidium bromide. The sample volume loaded per lane corresponds to the amount of DNA isolated from ~20,000 infected
cells. The full-length (35-kb) viral genome comigrates with fragments of cellular DNA. The specific replication derivatives from Ad.Ins2/2a and -b with lengths of 5.7
kb are marked. (C) 293 cells were infected, and DNA was analyzed as in panel B. To inhibit viral replication, hydroxyurea at a final concentration of 10 mM was added to
the 293 culture medium 1 h postinfection. (D) 293 cells were infected with Ad.Ins2/2a at different MOIs (PFU per cell). DNA analysis was performed as described in
panel B before Ad replication started (6 h postinfection) and after the initiation of replication (24 h postinfection) (34). M, 1-kb DNA ladder (Gibco BRL, Grand Island,

Quantitative Southern analysis revealed that ~5 × 10^4 of
the 5.7-kb AAd.IR-1 or AAd.IR-2 genomes and ~10^5 corresponding
full-length genomes were produced per cell after
infection with the corresponding first-generation vector at an
MOI of 10. The appearance of these small vector genomes was
linked to adenoviral DNA replication, because it was absent
when hydroxyurea, an inhibitor of viral DNA replication, was
added to the 293 culture medium after infection (Fig. 1C).
The amount of AAd.IR-1 genomes produced was analyzed 6
and 24 h after infection of 293 cells with different MOIs of
Ad.Ins2/2a (Fig. 1D). AAd.IR-1 genomes were absent when
cell lysates were analyzed before replication started (6 h postinfection).
At 24 h postinfection, the number of AAd.IR-1 genomes
increased when the viral dose was between MOI 1 and
10 and reached a plateau after infection with higher MOIs (50
to 500).

contains the pΔE1sp1A backbone. Ad packaging signal; IR, HS-4 IR; hAAT, transgene expression cassette; pIX, gene for Ad pIX protein. All fragment sizes are

DNA restriction analysis and sequencing of the 5.7-kb viral
genomes revealed the genome structure shown in Fig. 2A.
Notably, the 4.0-kb NotI and the 1.4-kb BamHI fragments were
specific for AAd.IR-1 and AAd.IR-2 and were absent from the
full-length genome and the original shuttle plasmid (Fig. 2B).
Both deleted vectors, AAd.IR-1 and AAd.IR-2, contained the
transgene cassette flanked by the inverted HS-4 elements,
which are linked on both sides to two identical inverted copies
of Ad DNA comprising the Ad ITR and packaging signal.
Importantly, these genomes were devoid of all Ad sequences
that encode viral proteins. This structure was confirmed by
sequencing the NotI fragments of both AAd.IR genomes with
primers specific to the Ad packaging region or primers specific
to regions within the transgene cassette (Fig. 2A). The sequencing
data demonstrated an accurate mechanism for the
duplication of the IRs in conjunction with the Ad packaging
signal and ITR.
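
The genome layout described above (a transgene cassette flanked by inverted HS-4 elements, each joined to an inverted copy of the Ad ITR and packaging signal) can be illustrated with a small string model. The sketch below is not taken from the study: the function names and the short toy sequences are hypothetical placeholders, and it models only the arrangement of the end product, not the recombination mechanism.

```python
# Toy model of the AAd.IR genome layout; all names and sequences are illustrative.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def predict_deleted_genome(left_end: str, ir: str, cassette: str) -> str:
    """Assemble left_end + IR + cassette + inverted IR + inverted left_end.

    left_end stands in for the Ad ITR plus packaging signal.
    """
    return (left_end + ir + cassette
            + reverse_complement(ir) + reverse_complement(left_end))


if __name__ == "__main__":
    left_end, ir, cassette = "AAGT", "CCGA", "TTT"  # arbitrary toy sequences
    genome = predict_deleted_genome(left_end, ir, cassette)
    flank = len(left_end) + len(ir)
    print(genome)
    # The two ends mirror each other, as expected for inverted duplicates.
    print(genome[:flank] == reverse_complement(genome[-flank:]))
```

In this model the product length is twice the combined length of the end region and the IR, plus the cassette, which is why shortening the IRs shortens the derivative genome.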

The appearance of AAd.IR genomes with identical, dupli- 
cated regions was linked to viral DNA replication and required 
the presence of IRs. DRs did not mediate this process. Based 
on these results, we hypothesized that the unique structure of 
AAd.IR could be the result of homologous recombination pro- 
cesses stimulated by the IRs flanking the transgene cassette 
(Fig. 3). This process could involve the formation of a Holliday 
structure, which can be resolved by a classical isomerization 
process (11, 16) or during Ad replication. 

If this model is correct, recombination products should appear
in cells coinfected with two Ads, one containing a sequence
in leftward orientation and the other containing an
identical sequence in rightward orientation with respect to the
Ad ITR and packaging signal (Fig. 4A). To test this hypothesis,
vectors Ad.Ins1/3a and Ad.Ins1/3b, containing one HS-4 element
and the hAAT transgene cassette in leftward or rightward
orientation, respectively, were added onto 293 cells separately
or in combination. Viral DNA was analyzed together
with cellular DNA after development of CPE, as described in
Fig. 1. As expected, no small vector derivatives were detected
in cells infected separately with Ad.Ins1/3a or Ad.Ins1/3b.
Importantly, a 4.2-kb deleted vector genome was generated
in cells after simultaneous infection with the two vectors
(Ad.Ins1/3a plus Ad.Ins1/3b; Fig. 4B). This product can only
form when the two double-stranded genomes (Ad.Ins1/3a
and Ad.Ins1/3b) recombine via the HS-4-hAAT homology region.
The amount of 4.2-kb deleted vector genomes was similar
to the amount observed for AAd.IR-1/2 (Fig. 1A). A corresponding
recombination product appeared in cells coinfected
with Ad.Ins1/1a and Ad.Ins1/1b containing the hAAT cassette
followed by the HS-4 element. To demonstrate that recombination
is not associated with the specific sequence or structure
of the HS-4 element and that recombination can be mediated
by other sequences, vectors Ad.hAATa and Ad.hAATb were
employed. These vectors contained 1.4-kb hAAT cDNA segments
linked to 0.3-kb bPA polyadenylation signals in left- or
rightward orientation. In cells coinfected with Ad.hAATa and
Ad.hAATb, this 1.7-kb hAAT-bPA region of homology also
efficiently mediated the formation of small deleted vectors.
The structure of the deleted genomes formed after coinfection
with two viruses was confirmed by restriction analysis. Representative
data are shown in Fig. 4B.
In conclusion, vectors deleted for all viral genes are efficiently
formed by a process that appears to involve homologous
recombination between two IRs present in one vector or
by recombination between independent vectors, each containing
one inverse homology element. Importantly, in contrast to
recombination within a single parental vector carrying two IRs,
recombination between two coinfected vectors results in the
formation of a hybrid AAd.IR carrying elements from both
parental vectors.

FIG. 4. Formation of genomic derivatives by coinfection of two parental vectors each carrying only one homology element. (A) The structures of Ad.hAATa,

The recombination model for the formation of the described
AAd.IR genomes relies on an extended homology between the
IRs. In order to assess whether the formation of genomic
derivatives quantitatively depends on the length of the homologous
elements, additional vectors were generated that contained
shorter IR elements than Ad.Ins2/2a (Fig. 5). The
shorter IRs derived from the HS-4 fragments, with lengths of
1.0 or 0.5 kb, allowed for the generation of the AAd.IR genomes
at the same rate as the 1.2-kb HS-4 elements used in the
experiment described in Fig. 1A. The corresponding replication
derivatives had the expected sizes of 4.9 and 3.9 kb,
respectively. However, a vector that contained very short
IRs consisting of stretches with 20 bp of dC and dG did not
yield detectable replication derivatives, indicating that a certain
minimum length of IR is required for AAd.IR formation. Notably,
both the 1.0- and 0.5-kb-long IRs were deleted for a GC-rich
region located at the ends of the 1.2-kb HS-4
fragments (3) (Fig. 5). Together with the data shown in Fig. 4,
this suggests that formation of AAd.IR genomes can be mediated
by any sequence present as an IR in the E1 region of Ad
vectors. To test whether the sequence of the intervening region
between the IRs is critical for the formation of genomic derivatives,
a vector was produced that had two inverted 1.2-kb
HS-4 fragments flanking a 4.1-kb β-Gal gene. During replication
of this vector, the expected 8.2-kb genome was formed as
efficiently as the genome of AAd.IR-1 or AAd.IR-2. This demonstrates
that IRs can be employed for the generation of deleted
vectors containing a transgene cassette of choice.

The presence of packaging signals in AAd.IR-1 and
AAd.IR-2 prompted us to test whether these viral genomes
were packaged into virions that could be banded in CsCl gradients.
293 cells infected with Ad.Ins2/2a (MOI 10) were lysed
48 h postinfection in order to liberate produced viral particles,
and cell lysates were separated by ultracentrifugation in CsCl
gradients to band and visualize viral particles. A prominent
additional viral band with a buoyant density of ~1.32 g/cm^3
appeared in CsCl step gradients between virions containing
full-length (~35-kb) genomes and empty or defective particles
(Fig. 6A). Analysis of viral material from purified particles
contained in this band demonstrated the packaged 5.7-kb
AAd.IR-1 genome (Fig. 6B). AAd.IR-1 particles could be separated
from contaminating particles with packaged full-length
genomes (Ad.Ins2/2a) by an additional CsCl equilibrium gradient
(Fig. 6B, lane 3). Based on quantitative Southern analysis
of DNA isolated from CsCl-banded particles, ~10^4 packaged
AAd.IR-1 genomes and ~5 × 10^4 packaged full-length genomes
were produced per cell (data not shown). Considering
the amount of corresponding genomes found at the time of
virus harvest inside the cell (Fig. 1B), this indicates that both
the 5.7- and the 35-kb genomes were packaged efficiently. Final
preparations of AAd.IR-1 after two CsCl gradient purifications
contained less than 0.1% contaminating first-generation
Ad.Ins2/2a, as analyzed by plaque assay (see Materials and
Methods). CsCl purification of other AAd.IR vectors resulted
in similar findings (data not shown).

Electron microscopy of AAd.IR-1 particles demonstrated
the same icosahedral shape as the first-generation Ad.Ins2/2a
particles (Fig. 7). Staining with uranyl acetate allows the central
viral cores to appear electron dense. While the lumina of
particles containing the full-length genomes were homogeneously
electron dense, virions containing the smaller genomes
had only spotted luminal staining, indicating the presence of
less packaged DNA in AAd.IR-1 particles. We speculate that
only one deleted genome is packaged per virion.

As a further test for the intactness of AAd.IR-1 particles, we
measured the ability to mediate gene transfer into cultured
cells based on reporter gene (hAAT) expression (Fig. 8). Confluent
SKHEP-1 cells were infected with purified AAd.IR-1
and Ad.Ins2/2a particles at an MOI of 2,000 genomes per cell.
This cell line does not significantly support the replication of 
first-generation Ad (30). The level of hAAT expression at day 
3 after infection was comparable for both vectors, indicating 
that in vitro gene transfer was similarly efficient. While trans- 
gene expression from the full-length vector was stable during 
the analyzed time period (7 days), hAAT expression declined 
gradually for AAd.IR-1 starting at day 4 postinfection. South- 
ern analysis of viral DNA isolated from infected cells revealed 
that the short duration of transgene expression was due to the 
instability of AAd.IR-1 genomes within transduced cells (Fig. 
9). The concentration of full-length viral genomes (Ad.Ins2/2a) 
was comparable in cells harvested at day 1 and day 7 postin- 
fection. In contrast, while the input concentration of AAd.IR-1 
genomes analyzed at day 1 postinfection was as high as for 
first-generation vectors, the number of AAd.IR-1 vector ge- 
nomes was barely detectable in transduced cells at day 7 post- 
infection. 

In conclusion, small AAd.IR genomes are efficiently formed
and packaged during replication of E1-deleted Ad vectors containing
two IRs flanking a reporter gene cassette. The mechanism
of formation of AAd.IR requires viral DNA replication
and most likely involves homologous recombination. AAd.IR 
formation can be achieved using IRs of various lengths and 
origins. The inverted homology elements required for AAd.IR 
generation can also be provided in trans by the coinfection of 
two independent viruses, resulting in a hybrid AAd.IR. Parti- 
cles containing the small genomes devoid of all viral genes 
could be separated from virions with full-length genomes 
based on their lighter buoyant density. These particles infected 
cultured cells with the same efficiency as first-generation vec- 
tors; however, deleted genomes were only short-lived within 
transduced cells. The production of high titers by using two IRs 
is technically straightforward and does not require helper vi- 
ruses, because all functions required for the replication of the 
small genome and for particle formation are provided from the 
full-length genomes amplified in the same cell. 



DISCUSSION 

We demonstrated that IRs inserted into first-generation Ad
vector genomes mediated precise genomic rearrangement, resulting
in vector genomes that were devoid of all viral genes
and which were efficiently packaged into functional Ad capsids.
This finding has practical implications for Ad vector development
and may contribute to a better understanding of Ad
replication and the functional importance of IRs.

production of the indicated products, since the inverse homology elements of a/b vector combinations can pair as proposed for vectors carrying IRs (Fig. 3). 293 cells
were infected with the indicated vector combinations at an MOI of 50, and viral DNA was analyzed as described in Fig. 1. Infection with single vector types carrying
one homology element did not yield any genomic derivatives. Infection with a/b vector pairs generated the expected vector derivatives of 2.4 kb (Ad.hAATa/b) and 4.2
kb (Ad.Ins1/1a/b and Ad.Ins1/3a/b). (B) The structure of the subgenomic (4.2-kb) Ad genome resulting from coinfection of Ad.Ins1/1a and Ad.Ins1/1b was analyzed
by BamHI restriction analysis as described in Fig. 2. For comparison, the BamHI digests of the corresponding pΔE1sp1A-based shuttle plasmids used for the generation
of Ad.Ins1/1a (pAd.Ins1/1a, lane 1) and Ad.Ins1/1b (pAd.Ins1/1b, lane 2) are shown. The BamHI fragments specific for the 4.2-kb recombination product
[A(Ad.Ins1/1a + Ad.Ins1/1b)] (lane 3) represent the double 0.45-kb band and the combination of the 0.2-, 1.4-, and 1.7-kb bands.

FIG. 5. Effect of shortened IRs on the formation of vector derivatives. 293 cells were infected with Ad vectors, and the viral DNA was analyzed as described in Fig.
1. Ad.IR(1.0) contained HS-4-derived IRs with lengths of 1 kb. Ad.IR(0.5) contained 0.5-kb-long IRs. The restriction sites for HindIII and AflII present within the
1.2-kb HS-4 element that were used to produce the shorter IRs are indicated. The proximal 200 nt of the HS-4 fragment contain a GC-rich region (chart at the top
of the figure). This region and about 150 bp of flanking polylinker are deleted in Ad.IR(1.0) and Ad.IR(0.5). The vector Ad.IR(G/C) contains synthetic 20-mer dG and
dC stretches flanking the transgene cassette. The vector derivatives have lengths of 4.9 kb for Ad.IR(1.0) and 3.9 kb for Ad.IR(0.5). The structure of the parental
vectors is shown on the right. Ad.bGal contained the 1.2-kb HS-4 elements flanking a 4.1-kb lacZ cassette inserted into pAd.RSV (14). The corresponding rearranged
vector derivative had the expected length of 8.2 kb.

AAd.IR genomes contained only the transgene cassette 
flanked on both sides by duplicated IRs, Ad packaging signals, 
and ITRs. This specific structure could be generated precisely 
and reproducibly by using IRs of different sizes and origins, but 
not by DRs. These findings implicated the involvement of 
homologous recombination in the formation process (Fig. 3). 

This hypothesis was confirmed by coinfection with two vec- 
tors, each containing only one region of homology. The 
AAd.IR genomes detected after coinfection could only have 
formed by both vectors recombining through oppositely orien- 
tated homology elements. This genomic rearrangement pro- 
cess could involve the formation of a Holliday structure whose 
formation and stabilization could be supported by the Ad 
DBP, which is known to enhance intermolecular interactions 
(39). We postulate that the Holliday structure could be re- 
solved by classical isomerization mediated by cellular recom- 
bination enzymes, which are highly conserved during evolution 
(11, 16). Alternatively, the unique mechanism of Ad DNA 
synthesis may account for the efficient resolution of a Holliday 
structure, as outlined in Fig. 3. In this context, it would be 
interesting to test whether similar genomic rearrangements 
mediated by IRs can be achieved with other DNA viruses 
(HSV, SV40, or polyomavirus) which use different replication 
strategies. 

At the late stages of viral infection with a relatively low MOI,
~5 × 10^4 AAd.IR genomes were produced per cell, which is
only twofold less than the number of full-length genomes produced
per cell. This implies that either the event that forms
AAd.IR genomes occurs very frequently or that only a small
number of rearranged genomes are originally formed and later
amplified by the Ad replication machinery. The AAd.IR genomes
are approximately six times shorter than the full-length
genomes and could therefore have a replicative advantage.
Previous studies have demonstrated that small Ad vector genomes
are replicated by viral proteins expressed from full-length
genomes present within the same infected cell (4, 6, 9,
26, 27). This supports the hypothesis that the vector rearrangement
is a rare event and that AAd.IR genomes are amplified
together with full-length genomes in transduced cells. This is
further supported by the low frequency of recombination seen
between Ad shuttle plasmids used for the generation of recombinant
Ads. The critical importance of AAd.IR genome replication
is also underscored by the observation that the amount
of generated AAd.IR genomes correlated with the kinetics of
Ad replication. The number of AAd.IR genomes generated
increased with the viral dose between 1 and 10 PFU/cell and
reached a plateau when infection MOIs were greater than 10.
In this context, it is notable that Ad replication starts only if a
certain threshold of early viral protein has accumulated and
reaches a plateau that is dictated by limiting viral and cellular
factors (37).

Our data do not exclude other mechanisms for the formation
of AAd.IR genomes. Particularly intriguing is the unique
mechanism of Ad replication, involving single-stranded intermediates
that can form intramolecular secondary structures.
In this context, stem-loop or cruciform-like structures formed
through intrastrand hybridization of IRs may be functionally
important in the formation of AAd.IR genomes. Elongation by
Ad Pol is relatively slow, and it may provide a lag time sufficient
to form stem structures within single strands during their
displacement. Hypothetically, Ad Pol can pause at the IR-stimulated
hairpin structure and switch template strands. Similar
mechanisms have been described for other DNA polymerases
(1, 13, 19, 38) and for retroviral reverse transcriptases
(7, 12, 15). Although the involvement of intramolecular stem-loop
or cruciform-like structures formed by IRs present within
single-stranded replication intermediates appeared to be an
attractive basis for the explanation of the AAd.IR structure,
the data obtained with coinfected Ad viruses containing only
single homology regions contradicted this hypothesis. Nonetheless,
local formation of intrastrand secondary structures
may initiate or support recombination processes. Clearly, our
data demonstrate that Ad replication is required for the high-level
production of AAd.IR genomes in infected cells, either as
the etiological event responsible for the genomic rearrangements
or as a supportive mechanism for the amplification of
rearranged genomes.

FIG. 6. Isolation of particles containing packaged AAd.IR-1 genomes. (A) Ad.Ins2/2a was amplified in 293 cells and banded by ultracentrifugation in a one-step
CsCl gradient. The lower band contains full-length genome and Ad.Ins2/2a particles, the middle band contains AAd.IR-1 particles, and the top band contains empty
and defective viral particles. Viral material from 3 × 10^8 293 cells collected 36 h after infection with Ad.Ins2/2a at an MOI of 50 PFU/cell is shown. (B) Undigested
viral DNA purified from banded viral material after centrifugation in CsCl gradients was analyzed in 0.8% agarose gels. Lane 1, DNA isolated from 10 μl of banded
Ad.Ins2/2a (full-length genome, 35 kb); lanes 2 and 3, DNA isolated from 10 μl of the banded AAd.IR-1 particles (5.7-kb genome) after a one-step CsCl gradient (lane
2) or after additional purification of AAd.IR-1 virions by ultracentrifugation in a CsCl equilibrium gradient (lane 3).

The structure of AAd.IR particles revealed by electron microscopy
and their buoyant density in CsCl gradients clearly
differ from empty particles. Furthermore, we demonstrated
that DNase I-treated AAd.IR virions efficiently transferred
their genomes into cells, as shown by Southern blotting and
transgene expression. These facts prove that AAd.IR vector
genomes, which contain two Ad packaging signals, were packaged
into Ad capsids. While the number of deleted AAd.IR
and corresponding full-length genomes produced per cell differed
only by a factor of 2, the ratio of full-length genome
particles to AAd.IR particles in CsCl gradients was 5:1 to 10:1.
This indicates that packaging of the small genomes was 2.5- to
5-fold less efficient than that of full-length genomes. These
numbers are in agreement with a study by Parks and Graham
(31) in which plasmids carrying Ad genomes of different sizes
were used in combination with helper virus to determine the
lower packaging limit for Ad vectors. Vectors of fewer than
27 kb were recovered with about half the efficiency of larger
vectors. Interestingly, a 15-kb genome was packaged at a higher
efficiency than were the 20- to 25-kb-long vectors. However,
the work of Parks and Graham demonstrated a clear disadvantage
in the amplification of genomes of less than 25 kb
during multiple virus passages. Yet, from this experiment, it
was not clear whether the smaller vectors were less efficiently
replicated or less efficiently packaged. The results of that study
are difficult to compare with those of our AAd.IR vectors,
which start out full length and are deleted in the producer cells
(perhaps after a critical event required for packaging has occurred)
during one round of large-scale amplification. Packaging
of a 9-kb mini-Ad vector generated by Cre-lox recombination
(27) or of encapsidated Ad chromosomes (4, 6, 17) has
been previously reported.
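
The 2.5- to 5-fold estimate above follows from dividing the particle ratio by the genome ratio. The short sketch below reproduces that arithmetic using only the figures quoted in the text; the function name is a hypothetical label chosen for illustration.

```python
def relative_packaging_deficit(genome_ratio: float, particle_ratio: float) -> float:
    """Fold reduction in packaging of the small genome versus the full-length genome.

    genome_ratio: full-length genomes per deleted genome inside the cell.
    particle_ratio: full-length particles per deleted-genome particle after banding.
    """
    return particle_ratio / genome_ratio


if __name__ == "__main__":
    genome_ratio = 1e5 / 5e4  # ~10^5 full-length vs ~5 x 10^4 deleted genomes per cell
    for particle_ratio in (5.0, 10.0):
        fold = relative_packaging_deficit(genome_ratio, particle_ratio)
        print(f"particle ratio {particle_ratio:.0f}:1 -> {fold:.1f}-fold less efficient")
```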

FIG. 7. Electron microscopy of Ad particles. Particles with Ad.Ins2/2a (A)
and AAd.IR-1 (B) genomes. Virions were purified by two rounds of ultracentrifugation
in CsCl gradients. Magnification, ×100,000.

FIG. 8. Expression of hAAT after infection with Ad.Ins2/2a and AAd.IR-1.
Confluent SKHEP-1 cells (10^6) were infected with purified Ad.Ins2/2a and
AAd.IR-1 virus at an MOI of 2,000 genomes per cell, and hAAT concentrations
were determined in culture supernatants by ELISA. Culture medium was supplemented
with 150 μM ZnSO4 for induction of hAAT expression and was changed
daily. Filled squares, cells infected with Ad.Ins2/2a; empty circles, cells infected
with AAd.IR-1. Results are based upon three independent experiments.

AAd.IR vectors were produced by standard techniques for
first-generation adenovirus amplification and purification. All
the functions required for AAd.IR genome generation,
replication, and particle formation are provided from the full-length
genomes amplified in the same cell. The efficiency of
vector production was 10^4 packaged genomes per cell or >1 ×
10^13 packaged genomes produced in a large-scale preparation
after one round of infection with first-generation vector.
Banded particles containing the genomic derivatives were
clearly separated in CsCl gradients based on their lighter buoyant
density, which allowed for their purification without contamination
with first-generation virus containing full-length genomes
(Fig. 2B). In this context, the production of high-titer
vectors devoid of all viral genes is less labor intensive than
helper-dependent production of gutless vectors (32) or packaged
adenovirus minichromosomes (17), both of which require
multiple passages of virus.
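
The production figures above can be related to culture scale with simple arithmetic. The sketch below is an illustration only, with hypothetical function names; it assumes the per-cell yield of 10^4 packaged genomes applies uniformly, and it uses the 3 × 10^8 cells mentioned in the Fig. 6 legend as an example input.

```python
def packaged_genome_yield(cells: float, genomes_per_cell: float = 1e4) -> float:
    """Total packaged genomes expected from a given number of producer cells."""
    return cells * genomes_per_cell


def cells_required(target_genomes: float, genomes_per_cell: float = 1e4) -> float:
    """Number of producer cells needed to reach a target number of packaged genomes."""
    return target_genomes / genomes_per_cell


if __name__ == "__main__":
    print(f"{packaged_genome_yield(3e8):.1e} packaged genomes from 3e8 cells")
    print(f"{cells_required(1e13):.1e} cells needed for 1e13 packaged genomes")
```

Under this assumption, the quoted yield of more than 10^13 packaged genomes corresponds to a preparation on the order of 10^9 infected cells.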



FIG. 9. Analysis of viral DNA in transduced cells. Confluent SKHEP-1 cells
were infected with purified Ad.Ins2/2a (full-length) and AAd.IR-1 virus at an MOI of 2,000
genomes per cell. At days 1 and 7 after infection (indicated on top), genomic
DNA was extracted. Ten-microgram samples were digested with BamHI and
analyzed by Southern blotting with an hAAT-specific probe. The vector-specific
fragment is 1.7 kb.



The AAd.IR-1 vector infected cells with the same efficiency 
as the corresponding first-generation vector based on a similar 
level of reporter gene expression at day 3 postinfection. How- 
ever, transgene expression declined over time due to the in- 
stability of AAd.IR-1 genomes in transduced cells. Nonethe- 
less, the high infectivity of AAd.IR-1 indicates that the viral 
structural elements present in AAd.IR particles are function- 
ally intact and mediate efficient cell entry, endosomal lysis, and 
nuclear import. This may allow for the efficient infection of a 
variety of cell types, including nondividing cells. The potential
for highly efficient gene transfer, together with the fact that the
AAd.IR vector genomes lack viral genes, makes AAd.IR vectors
practically important. For example, transient transgene expression
would be sufficient for a variety of cell biology or cell
cycle studies which require efficient gene transfer and the ab- 
sence of side effects associated with the expression of adeno- 
viral proteins. 

On the other hand, approaches aimed toward gene therapy 
of genetic disorders require stable gene expression. In agree- 
ment with the data presented here, we recently demonstrated 
that a 9-kb mini-Ad genome generated by Cre-lox recombina- 
tion was packaged into Ad particles that transduced cells effi- 
ciently; however, they were short lived and were completely 
degraded by day 7 after in vitro infection (27). We suggested 
that the expression of certain viral proteins, including pTP, is 
required to confer genome stability in transduced cells (26). 
Furthermore, in a study related to the present paper, we uti- 
lized adeno-associated virus elements in combination with the 
described rearranged vectors to mediate integration as a 
means of vector stabilization allowing for stable transgene ex- 
pression. 

The strategy of ΔAd.IR generation using coinfection of viruses, 
each containing one inverse homology element, can be 
used to combine elements of choice from two independent 
viruses. Because the packaging capacity of both parental vectors 
could be exploited, large inserts could be accommodated 
by ΔAd.IR vectors. In order to test this, we are currently 
generating ΔAd.IR vectors containing 12- to 15-kb transgenes. 
Furthermore, promoter and transgene could be placed into 
separate vectors so that the transgene would not be expressed 
during generation and amplification of the parental vectors 
unless both vectors were coinfected. This strategy may be extremely 
useful whenever transgene expression is toxic to producer 
cells. 

This study demonstrates proof of the principle that IRs can 
be used to create predictable genetic rearrangements within 
the framework of Ad replication. This method allows for the 
reliable and efficient generation of vectors devoid of all viral 
genes and has potential application in the development of 
vectors for gene therapy. 

While inverted DNA repeats are generally acknowledged to be an important source of genetic instability in 
prokaryotes, relatively little is known about their effects in eukaryotes. Using bacterial transposon Tn5 and its 
derivatives, we demonstrate that long inverted repeats also cause genetic instability leading to deletion in the 
yeast Saccharomyces cerevisiae. Furthermore, they induce homologous recombination. Replication plays a 
major role in the deletion formation. Deletions are stimulated by a mutation in the DNA polymerase δ gene 
(pol3). The majority of deletions result from imprecise excision between small (4- to 6-bp) repeats in a polar 
fashion, and they often generate quasipalindrome structures that subsequently may be highly unstable. 
Breakpoints are clustered near the ends of the long inverted repeats (<150 bp). The repeats have both intra- 
and interchromosomal effects in that they also create hot spots for mitotic interchromosomal recombination. 
Intragenic recombination is 4 to 18 times more frequent for heteroalleles in which one of the two mutations is 
due to the insertion of a long inverted repeat, compared with other pairs of heteroalleles in which neither 
mutation has a long repeat. We propose that both deletion and recombination are the result of altered 
replication at the basal part of the stem formed by the inverted repeats. 



Inverted repeats with various levels of homology are 
common in the DNA of many eukaryotes and also in human 
DNA and could be an important source of genetic instability 
within eukaryotic genomes or within fragments cloned into 
other organisms. Long inverted repeats (LIRs) and short 
inverted repeats in prokaryotes have unique genetic proper- 
ties leading to precise and imprecise excision of the regions 
associated with the repeats. For example, the 1.5-kb LIRs of 
insertions of bacterial transposons Tn5 and Tn10 lead to 
precise and imprecise excision through small (4- to 10-bp) 
direct repeats at or near the base of the LIRs (references 3 
and 22 and references therein). Inverted repeats that are 
either perfect palindromes or quasipalindromes (containing a 
short spacer between the repeats) frequently undergo dele- 
tion (10, 11). 

In order to examine the genetic consequences of LIRs in 
eukaryotes, we examined the fate of Tn5 inserted into the 
LYS2 gene of the yeast Saccharomyces cerevisiae (16-18). 
The low level of excision was greatly enhanced in strains 
containing mutations in the replicative DNA polymerase 
genes α and δ, suggesting a role for replication. Because the 
system was limited to the study of precise excision events 
(i.e., restoration of the Lys+ phenotype), the more general 
consequences of LIRs, such as imprecise excision, stability 
of altered forms of the LIRs, and interchromosomal recom- 
bination as a more global effect of LIRs, could not be 
addressed. A LIR that included IS50 repeats of Tn5 and an 
internal genetic marker that allowed the selection of precise 
and imprecise excision events was therefore developed in 
the present study. As previously reported, the LIR is much 
more unstable in strains with the mutant DNA polymerase δ 
gene. However, compared with imprecise excision, precise 
excision accounts for only a small proportion of the genetic 



changes. Analysis of the imprecise excisants is consistent 
with a role for replication. The majority of imprecise ex- 
cisants result from excision between small (4- to 6-bp) 
repeats in a polar fashion, often leading to the formation of 
quasipalindrome structures (<300 bp) that subsequently may 
be highly unstable. Using the newly developed LIR and 
various derivatives, we demonstrate a novel feature of 
LIR-induced genetic instability, namely, that LIRs are also 
hot spots for mitotic interchromosomal recombination. 

MATERIALS AND METHODS 

Plasmids and strains. To construct the Tn5(URA3) insert 
in the yeast LYS2 gene (diagrammed in Fig. 1), the central 
BglII-BglII 2,784-bp fragment of Tn5 in the plasmid 
pACYC184::Tn5 (17) was replaced by the 1,110-bp BglII-BglII 
URA3 fragment from the plasmid pFL34 (5). The 
region removed from Tn5 contained the unique loop (2,748 
bp) and 20 bp of the IS50 LIR from each side of the loop. The 
lengths of the remaining LIRs of the Tn5(URA3) (including 
BglII sites) are 1,520 bp. The LIR-URA3-LIR fragment was 
created by cutting Tn5(URA3) with HpaI, which cuts 185 bp 
from the external ends of IS50. Subsequently, the Tn5-13 
insertion (17) in the chromosomal gene LYS2 was replaced 
by Tn5(URA3) by homologous recombination with the fragment 
described above, yielding LYS2::Tn5(URA3). The 
Tn5(URA3) insert is flanked by 9-bp direct repeats created 
when the Tn5 was originally integrated into LYS2 (18). The 
Tn5(URA3) replacements were made in a set of isogenic 
strains, POL+-DM MATα lys2-Tn5-13 leu2-2 trp1-Δ1 ura3-x 
(or ura3-Δ; SmaI-HindIII deletion of all the URA3, leaving 
only 62 bp of 5' homology to the URA3 insert in Tn5; the 
deletion was constructed with the plasmid pJL164, provided 
by J. Li) and pol3-t-DM (same genotypes as POL+-DM, but 
with pol3-t) (16). The replacements were confirmed by 
genetic and Southern analyses. The point mutation lys2-111 
FIG. 1. Insert Tn5(URA3) in the LYS2 gene and proposed mechanism 
for precise and imprecise excision of the insert. The 
Tn5(URA3) insert has the central fragment of Tn5 replaced by the 
1,110-bp URA3 fragment (top left diagram). The lengths of the 
remaining LIR (thick vertical lines) are 1,520 bp. The Tn5-13 
insertion (17) in the chromosomal gene LYS2 (thin horizontal lines) 
was replaced by Tn5(URA3) via homologous recombination, yield- 
ing LYS2::Tn5(URA3). Tn5(URA3) is flanked by 9-bp direct repeats 
created when the Tn5 was originally integrated into LYS2 (16). 
Single-strand regions that might appear during replication of the 
segment containing the LIRs (IS50 segments [thick vertical arrows]) 
are proposed to lead to complementary pairing of the LIRs (de- 
scribed in reference 16). We propose that the replicating lagging 
strand (dotted line with ball corresponding to 3' end) encounters the 
paired LIR. Slippage between the 9-bp direct repeats (black boxes 
with arrows) results in precise excision (A). Imprecise excision (B 
and C) can occur if the lagging-strand replication machinery enters 
the stem (see the text). Small repeats (dotted arrowheads) are 
frequent; upon occurrence of a repeat, there is the possibility of 
slippage to a small repeat in a region of single-strand DNA or of 
continued replication. If slippage of the newly synthesized strand 
occurs between small repeats internal to the LIR (i.e., 1→2) (B), the 
excisant will contain short inverted repeats and may be unstable for 
Lys−-to-Lys+ reversion, as it can still undergo excision between the 
two 9-bp direct repeats. If slippage occurs between a small repeat 
internal to the LIR and an external repeat (i.e., 1→3) (C), there will 
be no opportunity to generate Lys+ and the excisant will be stable. 



(with a G→A base pair change 92 bp in the 5' direction from 
the Tn5-13 insertion; the mutation was provided by Y. I. 
Pavlov) was transferred into the chromosomal LYS2 allele 
from the plasmid pLL12-lys2-111 by gene conversion as 
described in reference 17. 

Genetic and molecular procedures. Genetic and molecular 
procedures were described previously (16, 17). The ura3 



mutations were isolated on a selective medium containing 
5-fluoro-orotic acid (4); a fluctuation test (described in refer- 
ence 13) was used to determine the mutation rates. Eight or 
more independent single colonies or groups of up to 10 single 
colonies grown at either 30 or 20°C (pregrowth) were used 
for rate determinations. Rates of intragenic recombination of 
the lys2 heteroalleles were determined similarly. The lys2 
mutant alleles in the Lys+ recombinants were identified as 
described in reference 9. For this purpose, the homozygotes 
for the mutant lys2 alleles from the independent Lys+ 
recombinants were selected on a medium containing alpha-aminoadipic 
acid (7). The ability to recombine with appropriate 
lys2 tester mutations was studied to identify the 
presence of each mutation of the initial pair of heteroalleles. 
To characterize the genetic stability of the inserts in the 
LYS2 genes, frequencies of reversion to Lys+ were determined 
as described previously (16, 17). To determine the 
breakpoints of imprecise excision, the region surrounding 
Tn5-13 was amplified in 25 cycles (2 min at 94°C, 1 min at 
55°C, and 3 min at 72°C) of polymerase chain reaction (PCR) 
with Vent polymerase (New England Biolabs) according to 
the manufacturer's instructions with the oligonucleotide 
primers (BioSynthesis Inc.) GAGACGCTAOGAAGTTCG 
and CGGCTAAGCTCATAACAT, complementary to the 
sequences 185 bp 5' and 265 bp 3', respectively, from the 
ends of Tn5-13. The sizes of PCR products were in agreement 
with results of the Southern analysis. PCR products 
were sequenced directly (30). Formamide (40%) was in- 
cluded in the sequencing gels when needed to resolve 
compression artifacts. 
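
The fluctuation test mentioned above can be illustrated with a short calculation. The estimator described in reference 13 is not reproduced in this text, so the sketch below uses the classical p0 method purely as an assumed stand-in: the fraction of parallel cultures with no mutants gives the mean number of mutational events per culture, which is then divided by the number of cells per culture. The function name, the simplified rate formula (m divided by final cell number), and the example counts are hypothetical.

```python
import math


def p0_mutation_rate(mutant_counts, cells_per_culture):
    """Estimate a mutation rate by the p0 method (illustrative only).

    mutant_counts: number of mutant colonies scored in each parallel culture.
    cells_per_culture: final number of cells per culture (assumed equal).
    """
    n_cultures = len(mutant_counts)
    n_zero = sum(1 for count in mutant_counts if count == 0)
    if n_zero == 0 or n_zero == n_cultures:
        raise ValueError("p0 method needs some, but not all, cultures without mutants")
    # Poisson zero class: p0 = exp(-m), so m = -ln(p0)
    mean_events = -math.log(n_zero / n_cultures)
    # Simplified conversion to a per-cell rate (an assumption of this sketch)
    return mean_events / cells_per_culture


# Hypothetical data: 10 cultures of 1e6 cells each
example_counts = [0, 0, 1, 0, 3, 0, 0, 2, 0, 0]
print(p0_mutation_rate(example_counts, 1e6))
```

The p0 method is only usable when a reasonable fraction of cultures contain no mutants; other estimators (for example, median-based ones) would be needed otherwise.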

RESULTS 

Deletions associated with LIRs. Previously, Gordenin et al. 
demonstrated that Tn5 as a model LIR was excised at a high 
rate in DNA polymerase-deficient mutants (16). However, 
only reversions to Lys+ through precise excision (breakpoints 
in the duplicated 9 bp of the Tn5 target) or unique 
in-frame imprecise excisions associated with LIRs were 
detectable in that study. In order to understand the general 
consequences of LIRs, we developed a genetic system in 
which the complete spectrum of Tn5 deletions could be 
selected. For this purpose, a 3-kb internal region of Tn5 was 
replaced by URA3 [i.e., LYS2::Tn5(URA3)] (diagrammed in 
Fig. 1). By selecting ura3 auxotrophs on 5-fluoro-orotic acid 
(4), a broad spectrum of genetic changes, including both 
precise (Ura− Lys+) and imprecise (Ura− Lys−) excisants, 
could be identified. The Ura− Lys− category included mutations 
specific to the URA3 gene. The Ura− Lys− auxotrophs 
could be categorized according to their ability to 
revert to Lys+, a feature allowing discrimination between 
excisants (described below). 
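
The distinction between precise and imprecise excision can be made concrete with a toy sequence. The sketch below assumes a hypothetical 9-bp target site and a made-up insert with inverted ends (none of these are the real LYS2 or Tn5 sequences); it shows that deleting one copy of a flanking direct repeat together with everything between the two copies restores the original sequence, which is what precise excision means here.

```python
def excise_between_direct_repeats(seq, repeat):
    """Remove one copy of `repeat` plus everything between its first two copies."""
    first = seq.find(repeat)
    if first == -1:
        raise ValueError("repeat not found")
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        raise ValueError("second copy of repeat not found")
    # Keep everything up to the first copy, then resume at the second copy
    return seq[:first] + seq[second:]


# Hypothetical sequences for illustration only
target = "ACGTTGCAA"                              # 9-bp target site duplicated on insertion
gene = "GGGATC" + target + "TTTCCG"               # toy uninterrupted gene
insert = "CTGACTCTT" + "N" * 10 + "AAGAGTCAG"     # toy element with inverted ends
mutant = "GGGATC" + target + insert + target + "TTTCCG"

restored = excise_between_direct_repeats(mutant, target)
print(restored == gene)
```

Slippage between any other pair of short repeats would leave part of the insert behind or remove flanking sequence, which corresponds to the imprecise class.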

A defect in replicative polymerase δ resulted in a 45-fold-higher 
rate of appearance of ura3 mutants at the semipermissive 
temperature (30°C) than at the permissive temperature 
(Table 1). This contrasted with the low rate in the Pol+ 
strain at 30°C. Only a small percentage of Ura− isolates were 
due to precise excision, i.e., were Ura− Lys+. As discussed 
below, nearly half of the events at the semipermissive 
temperature of growth for the pol3 mutant were due to 
imprecise excision. 

The Ura− Lys− mutants were genetically characterized 
according to their ability to revert to Lys+ (the result of the 
precise excision of the remaining insert). For the pol3-t 
strain, there were three categories of Ura− Lys− mutants 
(Table 1). The first (with ura3 mutations) yielded high levels 
TABLE 1. Frequencies and categories of Ura− mutants arising 
from the insert LYS2::Tn5(URA3) in Pol+ and pol3-t strains^a 

                                  % of Ura− isolates (no. of events) generated by: 
Strain and       Ura− mutation    ura3        Precise excision   Imprecise excision^b 
pregrowth temp   rate (10^-6)     mutation    (Lys+)             Stable      Unstable 

pol3-t 
  20°C           1.4              78 (72)     1.0 (1)            7.0 (6)     14 (13) 
  30°C           63               52 (109)    1.4 (3)            4.7 (10)    42 (90) 
Pol+ 
  30°C           0.12             95 (55)     1.7 (1)            0 (0)       3.3 (2) 

a The pol3-t and Pol+ strains are the derivatives of pol3-t-DM and 
POL+-DM strains with the LYS2::Tn5(URA3) insertion. The ura3 mutations 
were obtained on a selective medium containing 5-fluoro-orotic acid (4); a 
fluctuation test (described in reference 13) was used to determine the mutation 
rates. Since the rates for the strains carrying ura3-x and those carrying the 
ura3 deletion did not differ significantly, the results were combined. The 
characterization of the reversion to Lys+ (stability) was made with the 
mutants obtained in the ura3-x mutant strains. 

b Stability refers to the ability to revert from Lys− to Lys+. Stable Ura− Lys− 
isolates exhibited little (<10^-6) or no reversion to Lys+ at 20 or 30°C, while 
unstable Ura− Lys− isolates exhibited a high frequency (10^-5 to 10^-6) of 
Lys− to Lys+ reversion at both 20 and 30°C. 
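
The percentages in Table 1 and the fold difference quoted in the text can be rechecked from the event counts. The sketch below is only an arithmetic aid: the counts and rates are transcribed from the table as read here, and the dictionary keys and function names are hypothetical.

```python
# Event counts from Table 1, in the order:
# (ura3 mutation, precise excision, stable imprecise, unstable imprecise)
EVENTS = {
    "pol3-t 20C": (72, 1, 6, 13),
    "pol3-t 30C": (109, 3, 10, 90),
    "Pol+ 30C": (55, 1, 0, 2),
}

# Ura- mutation rates (x 10^-6) from the same table
RATES = {"pol3-t 20C": 1.4, "pol3-t 30C": 63.0, "Pol+ 30C": 0.12}


def percentages(events):
    """Percentage of isolates in each category."""
    total = sum(events)
    return [100.0 * n / total for n in events]


def imprecise_fraction(events):
    """Fraction of isolates due to imprecise excision (stable + unstable)."""
    return (events[2] + events[3]) / sum(events)


fold_increase = RATES["pol3-t 30C"] / RATES["pol3-t 20C"]
print(round(fold_increase))
for condition, events in EVENTS.items():
    print(condition, [round(p, 1) for p in percentages(events)],
          round(imprecise_fraction(events), 2))
```

With these numbers the rate ratio between 30 and 20°C in the pol3-t strain comes out at about 45, and imprecise excision accounts for a little under half of the isolates at 30°C, in line with the statements in the text.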



5' ggaaaaacaacaattaatgtgtttg 

(-113) attaa - C 

ttaccggtgtcacaggatttctgggctcctacatccttgcagatttgttaggacgttctc 

(-51) aggacg - A 



<-- LYS2 | IS50 long repeat --> 

caaagaactacagtttcaaagtgtttgccca£at£AOOac.ctgactcttatacacaagta 
(-9) cgtcagggc - P 

(-7) tcagg - D 
(-6) gtcag - B 
(-7) tcag - B 



gcgtcctgaacggaacctttcccgttttccaggatctgatcttccatgtgacctcctaac 
(+26) tgaacg - F (+63) ccatgt - H 

(+28) aacgg - G 

atggtaacgttcatgataac BglII URA3 BglII 

gttatcatgaacgttaccatgttaggaggtcacatggaagatcagatcctggaaaacggg 
F - tgaacg ccatgt - H tcag - B aacgg - G 

(+93) (+84) (+59) (+46) 



<-- long repeat IS50 | LYS2 --> 

aaaggttccgttcaggacgctacttgtgtataagagtcagfiofcSAflflOficaaggatgaag 
(+27) aggacg - A (-1) cgtcagggc - P 

(+29) tcagg - D (+-5) gtcag - B 



aagctgcatttgcaagattacaaaaggcaggtatcacctatggtacttggaacgaaaaat 



of Lys+ only at 30°C, which was similar to previous results 
with Tn5 (16, 17). These Ura− mutants were shown by 
Southern analysis to be due to mutations in URA3 (among 16 
mutants in the ura3-x strain, all appeared to be point mutations or, at 
most, small deletions, as determined by gel electrophoresis 
analysis) (data not shown). The second genetic category of 
Ura− Lys− isolates exhibited a phenotype which was novel 
in that there was a high frequency (10^-3 to 10^-6) of Lys− to 
Lys+ reversion at both 20 and 30°C; because of this property, 
this category was referred to as unstable. A third 
genetic category of Ura− Lys− mutants consisted of isolates, 
designated as stable, that exhibited little (<10^-6) or no 
reversion to Lys+ at either temperature. On the basis of 
genetic and molecular analysis (described below), we deter- 
mined that these last two categories are due to imprecise 
excision. 

The spectrum of changes for the Pol+ strain was different 
from that for the pol3-t strain in that nearly all the Ura− 
mutants (23 of 24) that exhibited the observed low rate of 
Lys+ reversion carried mutations in the URA3 gene, as 
determined by Southern analysis of the mutants obtained in 
the ura3-Δ strain (data not shown). This is consistent with the 
observation that Tn5 excision is low in Pol+ strains (16). The 
lower rates of URA3 mutation in Pol+ versus pol3-t strains 
are consistent with the previously reported mutator phenotype 
of the temperature-sensitive polymerase δ mutants (17, 
19, 34). (In an analysis of spontaneous mutations arising in 
the pol3 mutant mut7-1, deletions [<70 bp] between small 
repeats accounted for nearly 20% of the mutants [34].) 
Ectopic gene conversion of the ura3-x point mutation cannot 
be excluded as a source of at least some of the Ura− 
mutants. The rates of appearance of Ura− mutants did not 
differ (see the footnotes for Table 1) for strains having a point 
mutation or a deletion at the normal location of URA3; 
therefore, it is unlikely that ectopic conversion is the major 
source of point mutations in URA3 between the LIRs. 

The high rate of appearance of the Ura− Lys− mutants in 
the pol3-t strains that were unstable (at 20 and 30°C) or stable 
for reversion to Lys+ was due to imprecise excision. As 
determined by Southern blot and/or PCR analysis of the 



ttgcctcaaatattaaagttgtattaggcga 3' CENII 

(-90) attaa - C 

FIG. 2. Breakpoints of excision in the region of LYS2::Tn5(URA3). 
The Tn5(URA3) insertion into LYS2 (including the IS50 
long repeats) (Fig. 1) is located between the 9-bp repeats (18) 
(underlined sequence). Included in this diagram are 100 bp of each 
IS50 (2) and 100 bp of LYS2 (23) to the left and right of the insert. 
The orientation of LYS2 in chromosome II is based on unpublished 
results of Gordenin et al. (15a). Eight pairs of breakpoints (boldface) 
corresponding to imprecise excisions are designated A through H. 
Breakpoints C to H were sequenced in this work. Breakpoints A and 
B were identified previously and corresponded to in-frame imprecise 
excisions of Tn5-13 selected as Lys+ (16). Precise excision breakpoints 
are identified by P. The nucleotide coordinates in the left and 
right parts of LYS2 start from the border of the repeats with IS50 and 
are given negative values beginning at the start of the repeat. IS50 
nucleotides are given positive values. 



three stable and nine unstable excisants, they retained small 
remnants from the original Tn5(URA3) that were less than 
300 bp. An additional stable excisant was due to a deletion 
that encompassed all of Tn5(URA3) plus 203 bp of the 
surrounding LYS2 sequence (excisant C; see Discussion). 
Therefore, all breakpoints are clustered near the ends of the 
4,150-bp Tn5(URA3) insert. Sequence analysis was carried 
out on 9 of the 13 excisants (4 stable and 5 unstable), and six 
categories of deletions based on breakpoints were identified 
(Fig. 2). All breakpoints have short (4- to 6-bp) repeats. 
There were three excisants in the ins-G category and two 
excisants in the ins-H category. These results agree with 
previous analyses (16) of in-frame imprecise (included in Fig. 
2) and precise Tn5 excision in S. cerevisiae. All excisants 
independently isolated in this study, except excisant C, 
exhibited a polarity of breakpoints (see Discussion). 

Instability of quasipalindrome remnants. The unstable 
(Lys− to Lys+) excisants were caused by breakpoints within 
the IS50 repeats of the LIR (Fig. 1), leading to small 
quasipalindrome structures. Reversion to Lys+ occurred 
primarily through precise excision between 9-bp direct re- 



5318 GORDENIN ET AL. 



Mol. Cell. Biol. 



peats at the base of Tn5. Previously, Gordenin et al. dem- 
onstrated that a selective medium containing alpha-amino- 
adipic acid (7) can be used to distinguish precise excision 
from in-frame imprecise excision of LYS2::Tn5-13 (16). 
Precise excision generates the wild-type gene, resulting in 
sensitivity to alpha-aminoadipic acid (Adp−), whereas cells 
containing the pseudo-wild-type gene may be capable of 
growth on this inhibitor (Adp+) (16). On the basis of the low 
fraction of Adp+ revertants, we conclude that reversion to 
Lys+ of the unstable as well as stable excisants described in 
Table 2 was accomplished primarily by interactions between 
the direct repeats originally at the base of Tn5 (i.e., precise 
excision). Among revertants of the stable excisant E, a large 
fraction of the Adp+ isolates appear to reflect another 
mechanism of reversion (possibly imprecise excision or 
suppression). 

The excision frequency of quasipalindromes appears to be 
related to the length of DNA between the inverted repeats, 
as demonstrated by excisants F and G. The lengths of the 
inverted repeat regions are comparable; however, the inter- 
vening loop sequences are 8 bp for the highly unstable 
structure, compared with 55 bp for the more stable structure. 
The unstable excisant H also contains a small loop sequence 
(9 bp) between inverted repeats. Since the number of 
quasipalindrome inserts examined is small, additional com- 
parable inserts need to be investigated to confirm the effect 
of the intervening loop on stability. Three other unstable 
remnants (reversion frequencies of about 10^-5) that are 
approximately 300 bp long with inverted repeats greater than 
90 bp per repeat have been identified (data not shown). 
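
The two quantities compared above, the length of the inverted repeat arms and the length of the intervening loop, can be read off a sequence mechanically. The following sketch assumes a perfect inverted repeat anchored at both ends of the sequence and uses made-up sequences (not the actual excisant F, G, or H remnants); the function name is hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def stem_and_loop(seq):
    """Return (stem_length, loop_length) for a quasipalindrome.

    The stem is the longest run of positions, counted inward from both
    ends, in which the left base is complementary to the right base.
    """
    n = len(seq)
    stem = 0
    while 2 * (stem + 1) <= n and seq[stem] == COMPLEMENT[seq[n - 1 - stem]]:
        stem += 1
    return stem, n - 2 * stem


# Hypothetical quasipalindromes with the same 9-bp arms but different loops
arm = "GATTACAGG"
arm_rc = "CCTGTAATC"                  # reverse complement of `arm`
short_loop = arm + "T" * 8 + arm_rc   # 8-bp loop
long_loop = arm + "T" * 55 + arm_rc   # 55-bp loop

print(stem_and_loop(short_loop))
print(stem_and_loop(long_loop))
```

Real remnants may contain mismatches within the arms, which this simple end-anchored scan does not tolerate.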

The unstable quasipalindrome structures isolated in the 
pol3-t strain are also unstable in the Pol+ strain, although the 
reversion frequencies are an order of magnitude less. Unlike 
Tn5(URA3) and Tn5, they revert at a high frequency at the 
permissive temperature in the pol3-t mutant. 

Three rare reverting excisants (C, D, and E [Fig. 2 and 
Table 2]) do not contain any inverted repeats and are due to 
a deletion of one or both IS50 sequences of the LIR. The 
reversion frequencies are much lower in Pol+ than in the 
pol3-t strain. Possibly the revertants correspond to rare 
replication slippage, as indicated by the isolation of deletions 
between closely spaced small direct repeats in another pol3 
mutant (34). If so, the present results indicate that the pol3-t 
mutation can have an effect on replication by polymerase δ, 
even at the permissive temperature (also see Table 1). 

LIRs create a hotspot for mitotic recombination. As we 
have shown, LIRs exhibit unique properties in that they can 
lead to intrachromosomal instabilities. The proposed repli- 
cation mechanism(s) (see reference 16 and Discussion) 
suggests that LIRs could also influence interchromosomal 
recombination. We therefore measured heteroallelic recombination 
(gene conversion) between the point mutation lys2-8 
(9, 26) located approximately 0.6 to 0.8 kb upstream (5') 
from Tn5 and various lys2 insertions. Among the latter were 
the LIR insertions Tn5-13 and Tn5(URA3), as well as the 
24-bp (ins-D) and 55-bp (ins-E) pieces of the external end of 
IS50 and quasipalindromes ins-G and ins-H (Table 2 and Fig. 
2); all insertions were at the same position. We also studied 
mitotic recombination of lys2-8 with the point mutation 
lys2-111 located 92 bp from Tn5-13 in the direction of 
lys2-8 (see Materials and Methods). Reversion rates of the 
heteroalleles are shown in Table 3. On the basis of the high 
reversion rates, we conclude that the major source of 
reversion is heteroallelic recombination. The frequency of 
Lys+ for all heteroallelic diploids was 10 to 10^5 times more 
than the reversion frequency of the individual lys2 mutations 



(Table 2; data for lys2-8 are not shown). The reversion rates 
of Tn5-13 in the homozygous pol3 and Pol+ diploids did not 
differ significantly from the rates in the haploids (data not 
shown). 

In the pol3 mutant, recombination was generally higher at 
the semipermissive than at the permissive temperature for all 
pairs of heteroalleles tested, as expected from previous 
reports of the effects of polymerase mutations on recombi- 
nation (1) (Table 3). However, when one of the heteroalleles 
was the LIR insert, the recombination rate was 4- to 18-fold 
higher than that for the other heteroalleles in both the pol3 
and Pol+ isogenic strains (at 20 as well as 30°C). The 
increase in recombination frequencies is accompanied by a 
high (up to sevenfold) bias in favor of loss of the Tn5 allele 
compared with the loss of lys2-8 (Table 4). The bias is 
primarily due to a preference in gene conversion of the Tn5 
allele. (See reference 8 for a discussion of genetic outcomes 
of gene conversion and reciprocal recombination in mitotic 
cells.) We conclude that the observed induction and pattern 
of mitotic homologous recombination are due specifically to 
the LIRs (see Discussion). 

DISCUSSION 

Inverted repeats are a frequent source of genetic instability 
in bacteria. Using Tn5 as a source of LIRs and using 
newly developed derivatives that allow the detection of a 
broad range of changes, we have examined the consequences 
of various types of inverted repeats on genomic 
stability in yeast. Three types of genetic instabilities associated 
with inverted repeats have been demonstrated: a high 
rate of large deletions in a DNA polymerase δ mutant with 
the breakpoints clustering near the ends of LIRs, the stimulation 
of interchromosomal mitotic recombination in Pol+ 
and pol3 mutants, and the frequent deletion of quasipalindromes. 

The increased frequency of deletions in a pol3 mutant and 
the clustering of breakpoints support the idea that altered 
replication of LIRs could be the source of deletions. An 
analysis of the breakpoints of the imprecise excision led us 
to conclude that the inverted repeats can cause excision via 
short repeats either internal or external to the LIRs. The 
regions that include all the sequenced breakpoints (i.e., 100 
bp on either side of Tn5 ends [Fig. 2]) contain 128 short (4- 
to 8-bp) homologous sequences between them. The much 
higher frequency of imprecise excision than of precise 
excision (20- to 30-fold in the pol3-t strain) may be a 
reflection of the large number of small repeats available for 
imprecise excision. 
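
Counting short homologous sequences shared between two regions, as described above, amounts to looking for common k-mers. The sketch below is a minimal illustration on made-up sequences; it reports every shared k-mer of 4 to 8 bp with its positions and does not attempt to reproduce the count of 128, since the exact counting rules used for that figure (for example, how nested or overlapping matches were treated) are not given in the text.

```python
def shared_short_repeats(left, right, k_min=4, k_max=8):
    """List (kmer, left_pos, right_pos) for every k-mer present in both sequences."""
    hits = []
    for k in range(k_min, k_max + 1):
        # Index all k-mers of the left sequence by their start positions
        positions = {}
        for i in range(len(left) - k + 1):
            positions.setdefault(left[i:i + k], []).append(i)
        # Scan the right sequence for k-mers already seen on the left
        for j in range(len(right) - k + 1):
            kmer = right[j:j + k]
            for i in positions.get(kmer, []):
                hits.append((kmer, i, j))
    return hits


# Hypothetical flanking sequences for illustration only
left_region = "AAGTCAGTT"
right_region = "CCTCAGGG"
print(shared_short_repeats(left_region, right_region))
```

Each hit is a candidate pair of breakpoints for slippage; restricting the list to pairs with a given relative orientation would be the next step in testing a polarity rule.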

There is polarity of excision such that a breakpoint in one 
IS50 repeat of the LIR [portrayed as the right repeat of 
LYS2::Tn5(URA3) in Fig. 1] is always higher than the 
breakpoint in the other IS50 repeat (except for excisant C, in 
which the breakpoints are external, yet near the LIRs). 
Previous results with the imprecise excisants A and B (16) 
are consistent with the observed polarity, although there 
may be selection associated with these in-frame deletions. 
Further support for the polarity rule comes from a consid- 
eration of the orientation of short repeats within an LIR. 
That is, for any short sequence within one of the long repeats 
of the LIR, there is a corresponding short sequence of the 
opposite orientation in the other long repeat. For any pair of 
short direct repeats in an LIR, there is a corresponding 
symmetrical pair of short direct repeats. While there are two 
symmetrical opportunities for polarity of breakpoints, only 
one was observed. We propose that the polarity is a reflec- 
TABLE 3. Influence of the Tn5 LIRs on mitotic 
homologous recombination 

                       Recombination rate (10^-6) 
lys2                 pol3-t                 Pol+ 
heteroalleles^a      20°C      30°C        20°C      30°C 

Tn5-13/-8            30        406         45        64 
Tn5(URA3)/-8         NT^b      NT          56        44 
ins-D/-8             4.5       36          3.1       8.9 
ins-E/-8             5.1       50          5.5       8.4 
ins-G/-8             NT        NT          NT        4.6 
ins-H/-8             NT        NT          NT        9.0 
111/-8               4.0       32          NT        16 

a The isogenic pol3-t and Pol+ strains containing the inserts Tn5-13, 
Tn5(URA3), ins-D, ins-E, ins-G, ins-H (see the legend to Fig. 1 and the 
footnotes to Table 2), or the point mutation lys2-111 (see Materials and 
Methods) were crossed with the POL+-DA strain (MATa lys2-8 his3-15,11 
ura3 leu2 trp1-del1) or the pol3-t-DA strain (the congenic pol3-t strain with the 
same auxotrophic markers). The Pol+ strain was homozygous for the POL3 
gene. All resulting diploids were heteroallelic for the LYS2 mutations and, 
therefore, auxotrophic for lysine. The Lys+ intragenic recombinants were 
selected on medium with no lysine. The rates of heteroallelic recombination 
were determined by fluctuation testing of 10 to 15 cultures. Independent 
frequency measurements (16, 17) confirmed the differences (data not shown). 

b NT, not tested. 
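
The fold enhancement attributed to the LIR inserts can be recomputed from Table 3. The sketch below takes the rates as transcribed here and, as an assumption of this illustration, compares each LIR insert with every non-LIR allele tested under the same strain and temperature. Individual pairwise ratios may fall outside the 4- to 18-fold range quoted in the text, which presumably rests on the authors' own choice of comparisons.

```python
# Recombination rates (x 10^-6) transcribed from Table 3.
# Column order: pol3-t 20C, pol3-t 30C, Pol+ 20C, Pol+ 30C; None = not tested.
RATES = {
    "Tn5-13": (30, 406, 45, 64),
    "Tn5(URA3)": (None, None, 56, 44),
    "ins-D": (4.5, 36, 3.1, 8.9),
    "ins-E": (5.1, 50, 5.5, 8.4),
    "ins-G": (None, None, None, 4.6),
    "ins-H": (None, None, None, 9.0),
    "lys2-111": (4.0, 32, None, 16),
}
LIR_INSERTS = ("Tn5-13", "Tn5(URA3)")


def fold_ratios(lir_name):
    """Ratios of one LIR insert's rate to each non-LIR allele, column by column."""
    ratios = []
    for col, lir_rate in enumerate(RATES[lir_name]):
        if lir_rate is None:
            continue
        for name, row in RATES.items():
            if name not in LIR_INSERTS and row[col] is not None:
                ratios.append(lir_rate / row[col])
    return ratios


for name in LIR_INSERTS:
    r = fold_ratios(name)
    print(f"{name}: {min(r):.1f}- to {max(r):.1f}-fold")
```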



tion of the direction of replication of the lagging strand. The 
polarity rule for breakpoints is predicted by the model of Tn5 
excision via slipped mispairing in the replicating lagging 
strand (3, 16). (While polymerase δ is proposed as a lagging- 
strand polymerase [24], we note that our model could apply 
even if the polymerase functioned both in leading- and 
lagging-strand replication.) According to the model, single- 
stranded regions arising during altered lagging-strand repli- 
cation of Tn5 can form a secondary stem-and-loop structure 
within the LIR, thereby delaying or blocking subsequent 
DNA synthesis. We propose that although lagging-strand 
DNA synthesis is generally not considered to involve strand 
displacement, the lagging-strand replication machinery may 
enter the duplex formed by the internal repeats (Fig. 1). 
However, as it replicates into the stem, the replicating end 
may be unstable because of this uncommon form of lagging- 
strand replication. As a result, opportunities to undergo 
replication on a single strand may be favored. Replication 
into the stem would result in one IS50 repeat of the LIR 
becoming duplex and the other becoming single stranded. A 
deletion event could occur by strand slippage from the 
duplex IS50 repeat of the LIR to a repeat in the single-strand 
region. Since slippage into a single-strand region (rather than 
into an already-paired region) is likely to occur, polarity in 
relation to an origin of replication is predicted. For the 



region studied, the present results suggest that the LYS2 
gene is replicated from the chromosomal origin(s) located 
on the 5' side of LYS2 (centromere distal). 

Polarity, along with the observations on imprecise exci- 
sion above, also argues against LIR deletions being strictly 
the consequence of altered replication (34) rather than the 
consequence of interaction of the replication machinery with 
the LIR. The influence of replication direction on palin- 
drome-stimulated deletions within a plasmid was recently 
demonstrated for Escherichia coli (31). 

The observations that altered replication could enhance 
recombination (1) and that altered replication of LIRs can 
yield deletions led us to investigate whether LIRs could also 
induce interchromosomal recombination. We found that 
LIRs create a hotspot for mitotic intragenic recombination in 
pol3-t and Pol+ isogenic strains, in which mitotic gene 
conversion preferentially results in the loss of the LIR- 
containing insert (Tn5). While pol3-t mutants exhibit en- 
hanced recombination for all heteroallelic pairs, there is still 
a bias in favor of the LIR-containing insert. It remains to be 
established whether the bias will be seen with other mutants 
or agents that induce recombination. Several observations 
have led us to conclude that the enhanced recombination 
rates and conversion bias are specifically due to the LIRs. 
All comparisons were made between isogenic strains. All 
pairs of mutant alleles except one were at the same positions 
in the LYS2 gene. The exception (point mutation lys2-111) 
was also located close to the site of Tn5 insertion. The 
enhanced recombination is not due to the sequences near the 
ends of the LIRs, judging by the lack of increase by small 
inserts corresponding to one or both ends of the LIRs. The 
enhanced recombination rates and the conversion bias can- 
not be attributed simply to a nonspecific effect of a large 
insertion. We and others (9, 32) have demonstrated with 
several heteroallelic pairs of mutations in the LYS2, URA3, 
HIS4, and HIS3 genes that a large insertion (as well as 
deletions) can lead to as much as a 19-fold bias in favor of 
loss of a point mutation; this is in contrast to the present 
findings involving the Tn5 insertion. (The only exception 
was lys2-32 [9]; however, this mutation was subsequently 
shown to be due to a large Ty1 insertion [8].) The absence of 
conversion bias for the heteroallelic pairs of point mutations 
that do not exhibit hotspots is well documented (28). A 
conversion bias involving a hotspot has been reported for 
mitotic gene conversion of the HOT1 allele of yeast (33). 
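
The conversion bias discussed above can be checked against the counts in Table 4. The following is a minimal sketch, not the authors' analysis: the row and column labels are descriptive names chosen here, and the assignment of the three column groups to Pol+, pol3-t at 20°C, and pol3-t at 30°C is our reading of the table header. A ratio above 1 means that the allele N (the insert or point mutation) is lost more often than lys2-8.

```python
# Lys+ recombinant counts transcribed from Table 4 as (+/8, +/N, +/8N).
# Labels are descriptive names chosen for this sketch (an assumption);
# "N" is the Tn5 insert, an ins-E allele, or the point mutation.
TABLE4 = {
    ("Tn5 insert", "Pol+"): (106, 23, 1),
    ("Tn5 insert", "pol3-t 20C"): (103, 15, 1),
    ("Tn5 insert", "pol3-t 30C"): (103, 31, 7),
    ("ins-E", "pol3-t 20C"): (55, 44, 5),
    ("ins-E", "pol3-t 30C"): (100, 98, 7),
    ("point mutation", "Pol+"): (67, 78, 1),
    ("point mutation", "pol3-t 20C"): (40, 39, 16),
    ("point mutation", "pol3-t 30C"): (52, 61, 3),
}


def loss_bias(plus_8: int, plus_n: int) -> float:
    """Ratio of recombinants that lost allele N (+/8) to those that lost lys2-8 (+/N)."""
    return plus_8 / plus_n


for (allele, strain), (p8, pn, _both) in TABLE4.items():
    print(f"{allele:15s} {strain:11s} loss-of-N bias = {loss_bias(p8, pn):.2f}")
```

Only the Tn5 rows give ratios well above 1; the ins-E and point-mutation rows stay close to 1, in line with the statement that the bias is specific to the LIR-containing insert.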

In terms of the model described above, these results 
suggest that homologous recombination may be an alterna- 
tive to replication slippage. Possibly, the growing 3' end, 
when stalled at the base of a LIR secondary structure (Fig. 
1), can interact with another homologous sequence, resulting 
in gene conversion. This is reminiscent of the mechanism 



TABLE 4. Genotypes of Lys+ recombinants resulting from recombination between the heteroalleles with or without LIRs (a) 

                                      No. of recombinants 

                                               pol3-t 
lys2 heteroalleles       Pol+           20°C             30°C 
                    +/8  +/N  +/8N   +/8  +/N  +/8N   +/8  +/N  +/8N 

Tn5-7J/-8           106   23     1   103   15     1   103   31     7 
ins-El*              NT   NT    NT    55   44     5   100   98     7 
Ull*                 67   78     1    40   39    16    52   61     3 

(a) Genotypes of the Lys+ recombinants were determined as described in Materials and Methods and reference 9. N, insertion of Tn5, ins-E alleles, or point 
mutation lys2-lll; 8, lys2-8 allele; 8N indicates the presence of both alleles in the mutant lys2 gene; NT, not tested. 

proposed for DNA synthesis past a lesion in phage T4 (15). 
If this model is correct, the LIRs themselves could lead to 
ectopic recombination with another copy of the same repeat 
elsewhere in the genome when the 3' end is stalled after 
entering the paired region (Fig. 1). This possibility is cur- 
rently under investigation. As an alternative explanation for 
the observed recombinogenesis of LIRs, it is also possible 
that the repeats within a LIR have an increased probability 
of recombinational interaction; this in turn could result in a 
greater likelihood of interchromosomal recombination. The 
absence of stimulated recombination by quasipalindromes 
ins-G and ins-H could reflect a greater likelihood of bypass- 
ing the secondary structure formed by the short quasipalin- 
drome in the course of replication than of bypassing the LIR 
secondary structure. This is consistent with the higher 
deletion frequencies of the quasipalindromes. 

Palindromes and the unstable quasipalindromes described 
here are a special class of inverted repeats. High frequencies 
of palindrome excision have been observed in bacteria 
(references 10, 11, and 14 and references therein) and 
recently in S. cerevisiae (20, 29). Unlike Tn5 and Tn5 
(URA3), the unstable quasipalindrome structures isolated in 
the pol3-t strain are unstable in the Pol+ strain, and they 
revert at a high frequency at the permissive temperature in 
the pol3-t mutant. Possibly because of their small size, they 
can lead to secondary structures during normal replication, 
whereas intrastrand duplexes may occur between the LIRs 
of Tn5 only in mutants with altered replication (Fig. 1) (16). 
Small palindromes in bacterial plasmids have recently been 
demonstrated to generate stem-and-loop fragments physi- 
cally separate from the parental plasmid, suggesting tem- 
plate switching during replication as a source of excision 
(27). If unstable palindromes in eukaryotes have the same 
properties, such a mechanism could be a source of random, 
albeit rare, insertions into the DNAs of higher organisms. 

These results describing the multiple effects of inverted 
repeats on genome stability suggest that LIRs may be an 
important source of genetic change in genomes that contain 
many repeats. For example, within the human genome there 
are on the order of a million repeats that are greater than a 
few hundred base pairs (12, 21). Assuming randomness of 
orientation, the number of LIRs is considerable and the 
distances between them are comparable to those described 
in the present report. If the opportunities for interactions 
extend over large distances, the number of possible interac- 
tions becomes immense. Such interactions may explain the 
instability of human DNAs in yeasts (22a, 25). Given that 
there are comparable DNA polymerases in eukaryotes (6, 
35), we propose that the human genome is at considerable 
risk for rearrangements due to deletions between LIRs or 
through LIR-stimulated recombination. The extent of LIR- 
induced changes is expected to relate to the level of homol- 
ogy between repeats. The consequences of DNA divergence 
are currently being investigated. 

Despite recent technical improvements, the construction of recombinant adenovirus vectors remains a 
time-consuming procedure which requires extensive manipulations of the viral genome in both Escherichia coli 
and eukaryotic cells. This report describes a novel system based on the cloning and manipulation of the 
full-length adenovirus genome as a stable plasmid in E. coli, by using the bacterial homologous recombination 
machinery. The efficiency and flexibility of the method are illustrated by the cloning of the wild-type adenovirus 
type 5 genome, the insertion of a constitutive promoter upstream from the E3 region, the replacement of the 
E1 region by an exogenous expression cassette, and the deletion of the E1 region. All recombinant viral DNAs 
were shown to be fully infectious in permissive cells, and the modified E3 region or the inserted foreign gene 
was correctly expressed in the infected cells. 



Adenoviruses are generally associated with benign patholo- 
gies in humans and are characterized by a number of features 
which make them particularly attractive as gene transfer vec- 
tors for gene therapy or immunization purposes (7, 15, 21, 35). 
Most vectors are based on the adenovirus type 5 (Ad5) back- 
bone in which an expression cassette containing the foreign 
gene has been introduced in place of the early region 1 (E1) or 
early region 3 (E3). Viruses in which E1 has been deleted are 
defective for replication and are propagated in the human 293 
complementation cells, providing the E1 products in trans (16). 
Methods to construct such recombinant adenoviruses are well 
documented (3, 14). They usually consist of a first step of 
subcloning of the exogenous expression cassette into a segment 
of the viral genome. The recombinant vector is then produced 
by the reconstitution of a complete viral DNA molecule 
through ligation in vitro between the segment and the viral 
genome, followed by transfection into permissive cells. Alter- 
natively, cotransfection into the complementation cells of the 
viral genome and plasmid DNA molecules can generate the 
recombinant viruses by homologous recombination in vivo. 
These methods frequently generate a background of nonre- 
combinant viruses, and despite recent improvements (23), re- 
peated screening of many plaques is sometimes required in 
order to isolate pure recombinant vectors. 

Two alternative procedures in which no parental infectious 
viral genome is used have been described (5, 24, 29). One 
method is based on the use of overlapping Ad5 DNA sequences 
cloned in two bacterial plasmids (5). The first plasmid 
carries the total Ad5 genome with a deletion of the DNA 
packaging signal and part of the E1 region, with the left and 
right inverted terminal repeat (ITR) sequences directly 
covalently joined. The second plasmid contains the left-end viral 
sequences, including the packaging signal and a passenger 
gene in place of the E1 region. Cotransfection of these plasmids 
in 293 cells results in the production of infectious recombinant 
vectors by in vivo homologous recombination. The configuration 
of the first plasmid is, however, known to be unstable 
in Escherichia coli (13, 28, 31). Moreover, the introduction of 
specific mutations or deletions in regions other than E1 requires 
tedious preliminary cloning steps in E. coli. The second 
alternative method is based on the manipulation of the full-length 
Ad genome as an infectious yeast artificial chromosome 
(YAC) (24). Targeted modifications of the viral genome are 
introduced by homologous recombination in yeast cells, and 
infectious virions are generated after transfection of the 
adenovirus genome, excised from the YAC vector, into appropriate 
cells. Although powerful, this method requires the use of 
an additional host (yeast) in which DNA yields are relatively 
low. 

In the present study, we describe a novel procedure for the 
generation of recombinant adenovirus vectors that takes ad- 
vantage of the highly efficient homologous recombination ma- 
chinery of E. coli (6, 10, 11, 25, 30). We applied this method to 
clone the full-length Ad5 genome in one unique and stable 
infectious bacterial plasmid and to modify specifically several 
viral genetic regions. The production of vectors constitutively 
expressing the E3 region and/or bearing a human coagulation 
factor IX (hFIX) expression cassette in place of E1 is described 
in this paper as an example of the flexibility of this method. 

The full-length 36-kb Ad5 genome was first cloned in a 2-kb 
plasmid by homologous recombination in E. coli between Ad5 
virion DNA and a linear form of the pTG3601 plasmid (Fig. 
1A; Table 1). pTG3601 was derived from the ppolyII plasmid 
(27) by insertion of 935- and 853-bp Ad5 DNA fragments 
amplified by PCR from the left and right ends of the Ad5 
genome, respectively. Cotransformation of the BglII-linearized 
pTG3601 plasmid and Ad5 genomic DNA into E. coli BJ5183 
recBC sbcBC (18) regenerated a stable circular pTG3602 plasmid 
containing the total Ad5 genome and conferring ampicillin 
resistance to the bacterium (Fig. 1). The frequency of 
recombinants was high, as estimated by the ratio between the number 
of ampicillin-resistant colonies obtained after cotransformation 
of BglII-digested pTG3601 and Ad5 DNA and the number 
of colonies obtained after transformation of the linearized 
pTG3601 plasmid alone. Depending on the experimental 
conditions, this ratio was between 5 and 61 (Table 1). This ratio 
indicates that the linearization of the plasmid prevents the 
growth of background colonies containing the parental vector. 
The efficiency of recombination could, however, not be 
significantly improved by altering the insert (I):vector (V) ratio 
(Table 1). One explanation might be that the size of the Ad5 

FIG. 1. Recombinational cloning of infectious full-length Ad5 genome in E. coli. (A) E. coli BJ5183 was cotransformed with Ad5 genomic DNA and a 
BglII-linearized pTG3601 plasmid containing 935 and 853 bp from the left and right ends of Ad5, respectively. In vivo homologous recombination generates 
ampicillin-resistant colonies containing the 38-kb pTG3602 plasmid carrying the full-length Ad5 genome. Large-scale preparation of pTG3602 DNA was carried out 
in the C600 E. coli strain. Cloned Ad5 DNA is still fully infectious as shown by the development of virus plaques after transfection of permissive human cells with 
PacI-restricted pTG3602 DNA. (B) Restriction endonuclease analysis of pTG3602 plasmid DNA (lanes I), DNA purified from AdTG3602 virus produced by 
transfection of PacI-restricted pTG3602 DNA (lanes II), and wild-type Ad5 DNA (lanes III). Plasmid DNA (1 µg), 10 µl of viral DNA prepared according to the 
method of Hirt (20), or 1 µg of purified Ad5 DNA was digested with BglII (B), EcoRV (E), HindIII (H), PvuI (P), and SphI (S). Carets mark the positions of bands 
corresponding to the plasmid-Ad5 junctions. Size markers are indicated on the left. 



genome (36 kb) constitutes a limiting factor for the efficiency 
of transformation. This possibility is supported by the obser- 
vation that E. coli transformation is inhibited when increased 
amounts of Ad5 DNA are used (Table 1). 
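
The two ratios used in Table 1 can be reproduced with a few lines of arithmetic. This is a minimal sketch under stated assumptions: the linearized vector length of about 3.8 kb is our estimate (2-kb backbone plus the 935- and 853-bp Ad5 arms) and is not given explicitly in the text; the function names are ours.

```python
INSERT_KB = 36.0                          # Ad5 genome length stated in the text
VECTOR_KB = 2.0 + 0.935 + 0.853           # assumption: backbone plus both Ad5 arms


def molar_ratio(insert_ng: float, vector_ng: float) -> float:
    """Insert:vector molar ratio from DNA masses and lengths."""
    return (insert_ng / INSERT_KB) / (vector_ng / VECTOR_KB)


def hr_over_v(cotransformed: int, vector_alone: int) -> float:
    """Colonies with vector plus insert divided by colonies with vector alone."""
    return cotransformed / vector_alone


# (vector ng, insert ng, colonies with insert, colonies with vector alone), from Table 1
ROWS = [(1, 9.4, 245, 4), (1, 94, 136, 4), (10, 94, 627, 64), (10, 940, 321, 64)]

for vector_ng, insert_ng, with_insert, alone in ROWS:
    print(vector_ng, insert_ng,
          round(molar_ratio(insert_ng, vector_ng)),
          round(hr_over_v(with_insert, alone)))
```

Under these assumptions the rounded values match the I:V and HR/V columns of Table 1, including the drop in HR/V when a 10-fold molar excess of insert is used.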

Eight bacterial colonies randomly isolated for further char- 
acterization were all shown by DNA restriction analysis to 
contain the expected plasmid (Table 1), suggesting that each 
recombination event occurred through homologous pairing of 
free viral DNA ends. However, plasmid DNA yields were low 
in BJ5183 cells, probably because of the formation of plasmid 
multimers (26). pTG3602 DNA was therefore retransformed 



into the C600 bacterial strain (22) to isolate larger quantities of 
plasmid DNA. Extensive characterization of pTG3602 DNA 
by restriction endonuclease analysis failed to detect any visible 
DNA rearrangement (Fig. 1B). The stability of the 38-kb plasmid 
is presumably due to the insertion of the 2-kb ppolyII 
sequence between the two viral ITRs, avoiding the constitution 
of palindromic sequences (28, 31). 

Two unique PacI sites were introduced by PCR immediately 
upstream of the left ITR and downstream of the right ITR in 
the initial pTG3601 plasmid (Fig. 1A). Since PacI is absent in 
Ad5 genomic DNA, PacI digestion allows the precise excision 

TABLE 1. Efficiency of recombinational cloning of full-length Ad5 genome 

Vector pTG3601-BglII   Insert Ad5   I:V molar   No. of Amp(r)   HR/V (b) 
(ng)                   DNA (ng)     ratio (a)   colonies 

 1                       0                         4 
                         9.4          1          245             61 
                        94           10          136             34 
10                       0                        64 
                        94            1          627             10 
                       940           10          321              5 

(a) I, insert; V, vector. 

(b) The efficiency of homologous recombination (HR) is indicated by the HR/V 
ratio, calculated as the ratio of the number of colonies generated by the 
cotransfected vector and Ad5 insert DNA and the number of colonies generated by the 
vector alone. 



of the full-length Ad5 genome from the pTG3602 plasmid. The 
infectivity of this pTG3602-derived Ad5 genome was demon- 
strated by calcium phosphate transfection (17) of 293 and 
A549 cells (34): large numbers of plaques were observed in 
cells transfected with the PacI-restricted pTG3602 DNA, while 
the closed circular plasmid was unable to generate any viral 
plaques, confirming that at least one ITR extremity has to be in 
a free configuration to allow efficient adenovirus DNA repli- 
cation (Table 2) (4, 19). These results show that the plasmid- 
derived Ad5 genome is fully infectious, with a specific infec- 
tivity of about 1/20 to 1/40 of that of Ad5 genomic DNA 
purified from wild-type virus particles (Table 2). Progeny virus 
recovered from independent pTG3602 plaques was amplified 
on 293 cells and further analyzed for growth characteristics, 
virus production yields, and DNA restriction patterns. In all 



TABLE 2. Specific infectivity of cloned wild-type and recombinant Ad5 genomes (a) 

DNA              DNA concn (µg)   No. of plaques/6-cm dish 
                                  293 cells        A549 cells 

Ad5              0.1              >50, >50         4, 5 
                 1                Lysed, lysed     23, 34 
pTG3602          1                0, 0             0, 0 
                 5                0, 0             0, 0 
pTG3602-PacI     1                23, 33           1, 0 
                 5                Lysed, lysed     6, 10 
pTG3604-PacI     1                53               1 
                 5                Lysed            13 
pTG3614-PacI     1                20, 41           ND 
                 5                Lysed, lysed     ND 
pTG3606-PacI     1                52, 27           ND 
                 5                Lysed, lysed     ND 
pTG3622-PacI     1                17, 9            ND 
                 5                63, 60           ND 
pTG3623-PacI     1                17, 17           ND 
                 5                34, 22           ND 

(a) A549 and/or 293 cells were transfected with 0.1 and 1 µg of Ad5 DNA or 
nonrestricted or PacI-restricted plasmid DNA. Cells were overlaid with agar 15 
h later, and plaques were then counted 14 days posttransfection. ND, not done 
(these viruses have E1 deletions and do not grow on A549 cells). Results from 
two independent experiments are shown (except for those for pTG3604-PacI). 

cases, pTG3602-derived adenovirus (AdTG3602) was indistinguishable 
from wild-type Ad5 (Fig. 1B). 

Since construction of recombinant adenovirus vectors is usu- 
ally a time-consuming procedure, we then tested whether a 
single-step replacement strategy exploiting the E. coli homol- 
ogous recombination machinery could be designed to modify 
selectively any particular genetic region of Ad5 (Fig. 2). The 
viral region to be modified is first subcloned into a bacterial 
plasmid and the desired deletions, insertions, or mutations are 
performed by conventional molecular biology techniques (32). 
The modified segment is then purified and cotransformed into 
BJ5183 together with the appropriately restricted plasmid con- 
taining the full-length Ad5 genome (Fig. 2). Homologous re- 
combination events between the free ends of the modified 
replacement fragment and their homologous viral sequences in 
the linearized vector eventually generate plasmids containing 
Ad5 genomes with the required modifications. 
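
The single-step replacement described above can be pictured as a string operation. The following toy sketch is an illustration only, not a model of the RecBC sbcBC pathway: the function name `recombine`, the fixed arm length, and the short sequences are all hypothetical. It joins the two arms of a linearized vector to a replacement fragment wherever the fragment ends find matching sequence in the vector.

```python
def recombine(vector_left: str, vector_right: str, fragment: str, arm: int) -> str:
    """Join a linearized vector and a replacement fragment through end homology.

    vector_left and vector_right are the sequences on either side of the cut;
    the first and last `arm` bases of the fragment must occur in them.
    """
    left_arm, right_arm = fragment[:arm], fragment[-arm:]
    i = vector_left.rfind(left_arm)      # homology closest to the cut, left side
    j = vector_right.find(right_arm)     # homology closest to the cut, right side
    if i < 0 or j < 0:
        raise ValueError("fragment ends share no homology with the vector")
    # Vector sequence between the two homology arms is replaced by the fragment.
    return vector_left[:i] + fragment + vector_right[j + arm:]


# Hypothetical toy sequences: the vector is cut between "...GAAT" and "CGGA...".
left, right = "ACGTACCTGAAT", "CGGATTCAGCTA"
replacement = "CCTG" + "TTTTTT" + "TCAG"   # arms flanking a new payload
print(recombine(left, right, replacement, arm=4))
```

In this picture the vector bases between the two arms, including the cut site, are discarded, which is why linearizing inside the targeted region favors recovery of the modified genome.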

As an example, we derived from pTG3602 a new plasmid 
(pTG3604) containing a full-length Ad5 genome in which 
expression of E3 (36) was relieved from the control by E1 
proteins and made constitutive by the introduction of the Rous 
sarcoma virus 3' long terminal repeat promoter sequences 
(LTR RSV). LTR RSV was inserted in front of E3 at nucleotide 
(nt) 28249 (throughout the manuscript, nucleotide numbers 
refer to positions on the Ad5 genome, according to reference 
8) in a transfer plasmid (pTG1696) bearing an Ad5 segment 
from nt 21562 to the right-end ITR. This insertion site was 
chosen in order to maintain the integrity of the L4 mRNA 
polyadenylation signal, which overlaps the E3 mRNA 
transcription initiation site (33). A BglI fragment (nt 24862 to 
35211) encompassing the modified E3 region was then isolated from 
pTG1696 and recombined in E. coli with pTG3602 previously 
linearized at nt position 27082 by SpeI digestion (Fig. 2A). 
SpeI cuts the Ad5 genome upstream of the targeted E3 region 
and leaves three regions of DNA homology with the pTG1696 
BglI fragment: (i) 2,220 bp between the BglI (nt 24862) and 
SpeI (nt 27082) sites, (ii) 1,167 bp between SpeI (nt 27082) and the 
5' boundary (nt 28249) of LTR RSV, and (iii) 7,623 bp between 
the 3' boundary (nt 27588) of LTR RSV and the BglI (nt 35211) 
site. As a consequence, homologous recombination should 
generate two types of plasmids (Fig. 2A): (i) the parental 
pTG3602 plasmid reconstituted by recombination between 
sequences upstream from the SpeI site and sequences located 
between SpeI and the E3 region and (ii) the expected pTG3604 
plasmid containing the E3-modified genome produced by 
recombination between sequences upstream from SpeI and 
downstream from the E3 region. The radioactive screening of 
the ampicillin-resistant colonies with an LTR RSV oligonucleotide 
probe confirmed this hypothesis since 18.5% of the 270 
colonies were found to contain the pTG3604 plasmid (Table 
3). DNA analysis performed on six randomly selected candidates 
confirmed that the cloned viral genome was genetically 
stable and full-length and contained a modified E3 region. 
Moreover, homologous recombination in BJ5183 was highly 
efficient since 270 ampicillin-resistant colonies were obtained 
after cotransformation with the linearized pTG3602 plasmid 
and the pTG1696 fragment, while only 2 colonies appeared 
with the linearized pTG3602 vector alone (Table 3). 
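
The homology lengths and the screening yield quoted in this paragraph follow from the nucleotide coordinates by simple subtraction. The sketch below is a consistency check only; the position of the left BglI site is not legible in the scan and is inferred here from the stated 2,220-bp length, so it should be treated as an assumption.

```python
SPEI = 27082          # SpeI linearization site
LTR_5P = 28249        # 5' boundary of the LTR RSV insertion
LTR_3P = 27588        # 3' boundary, as printed in the text
BGLI_RIGHT = 35211    # right-hand BglI site
ARM_I_BP = 2220       # stated length of homology region (i)

bgli_left = SPEI - ARM_I_BP       # inferred left BglI position (assumption)
arm_ii = LTR_5P - SPEI            # homology region (ii)
arm_iii = BGLI_RIGHT - LTR_3P     # homology region (iii)
positive_pct = 100 * 50 / 270     # pTG3604-positive colonies (Table 3)

print(bgli_left, arm_ii, arm_iii, round(positive_pct, 1))
```

The computed values agree with the 1,167-bp and 7,623-bp regions and with the 18.5% figure given in the text.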

Similar to E3, the E1 region can also be efficiently modified 
by homologous recombination in E. coli. As an example (Fig. 
2B and C), the E1 region of Ad5 was either deleted or replaced 
by an expression cassette encoding hFIX (2). Replacement of 
E1 by hFIX was done by cotransformation of E. coli BJ5183 with 
a pTG3602 or pTG3604 plasmid linearized in E1 by ClaI (nt 918) 
digestion and an MscI DNA fragment bearing the hFIX cDNA 
under the control of the mouse phosphoglycerate kinase gene 

FIG. 2. Construction of recombinant adenovirus vectors by homologous recombination in E. coli. Several strategies of Ad5 genome manipulation are presented: 
insertion of a new regulatory sequence (LTR RSV promoter) upstream of the Ad5 E3 region (A), replacement of the Ad5 E1 region by a foreign sequence (expression 
cassette for human coagulation factor IX) (B), and deletion of the Ad5 E1 region (C). x, homologous recombination events required for the targeted modification of 
the viral DNA; X, a homologous recombination event that reconstitutes the parental target vector (panel A). ITRs (closed boxes), Ad5 genomic DNA (open boxes), 
foreign sequences (stippled boxes), E1 deletion (thick line), plasmid sequences (thin lines), β-lactamase gene (arrows), and plasmid excision (labeled PacI) and 
linearization (labeled BglI, ClaI, and SpeI) restriction sites are also indicated. 



promoter (1) and the simian virus 40 polyadenylation signal 
(Fig. 2B). The MscI insert was isolated from a transfer plasmid 
(pTG4378) containing the hFIX expression cassette in place of 
the E1 region (nt 459 to 3328), in an adenovirus DNA fragment 
corresponding to 17% of the left genomic sequence (nt 1 
to 6241). Excision by MscI leaves upstream and downstream 
from the hFIX cassette 188 and 2,044 bp, respectively, of Ad5 
sequences homologous to viral DNA in pTG3602 and 
pTG3604 (Fig. 2B). Cotransformation into BJ5183 generated 
only a few ampicillin-resistant colonies (Table 3), probably as a 
consequence of the short length (188 bp) of sequence with 
DNA homology upstream of the ClaI site (see below). However, 
linearization of target vectors in the E1 region significantly 
increased the frequency of colonies containing plasmids 



TABLE 3. Efficiency of targeted modifications of the cloned Ad5 genome (a) 

Insert           Vector           HR/V (b)   No. of positive colonies/total 
                                             no. of colonies (plasmid) 

pTG1696-BglI     pTG3602-SpeI     270/2      50/270 (pTG3604) 
pTG4378-MscI     pTG3602-ClaI     2/0        1/2 (pTG3614) 
                 pTG3604-ClaI     17/19      4/17 (pTG3606) 
pTG9530-DrdI     pTG3602-ClaI     101/1      6/6 (pTG3622) 
                 pTG3604-ClaI     569/48     6/6 (pTG3623) 

(a) Plasmids carrying wild-type (pTG3602) or E3-modified (pTG3604) Ad5 
genomes were digested by SpeI for E3 region targeting or by ClaI for E1 region 
targeting and transformed into E. coli BJ5183 either alone or together with a 
10-fold molar excess of the purified replacement insert. Positive colonies 
containing plasmids with the modified full-length Ad5 genomes were identified by 
radioactive screening or with a plasmid minipreparation. 

(b) Number of colonies generated by the cotransfected vector and insert per the 
number of colonies generated by the vector alone. 



in which E1 was replaced by the hFIX expression cassette: 1 
positive colony of 2 and 4 positive colonies of 17 identified for 
the pTG3602- and pTG3604-derived vectors, respectively (Ta- 
ble 3). The negative colonies contained the background 
pTG3602 and pTG3604 parental vectors, probably correspond- 
ing to residual uncut vectors. 

We tested whether increasing the length of the region with 
DNA homology upstream of the E1-replacement cassette 
might improve the homologous recombination efficiency (30). 
We used a transfer plasmid (pTG9530) containing an Ad5 
left-end region (nt 1 to 5788) lacking part of the E1 
sequences (nt 459 to 3328) in the same ppolyII plasmid backbone 
as pTG3602. A DrdI fragment isolated from pTG9530 
was cointroduced into bacteria along with the ClaI-restricted 
pTG3602 or pTG3604 target vector (Fig. 2C). The sequence of 
DNA homology upstream of the ClaI site was 2,244 bp long 
(compared with 188 bp in the previous experiment), and this 
increase in length dramatically affected the HR/V ratio, which 
increased to 101/1 and 569/48 for the pTG3602- and 
pTG3604-derived vectors, respectively (Table 3). Analysis of six 
randomly picked colonies showed the presence of the expected 
pTG3622 (full-length Ad5 with E1 deleted) and pTG3623 
(full-length Ad5 with E1 deleted and with a constitutive E3 
region) recombinant plasmids. 
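
The effect of the longer upstream homology can be quantified directly from the E1-targeting rows of Table 3. This is a rough sketch with names chosen here for illustration; where the vector-alone control gave no colonies, the HR/V ratio is undefined and is reported as None rather than guessed.

```python
# (upstream homology in bp, vector, colonies with insert, colonies with vector
#  alone, positive colonies, colonies screened), transcribed from Table 3
ROWS = [
    (188, "pTG3602", 2, 0, 1, 2),
    (188, "pTG3604", 17, 19, 4, 17),
    (2244, "pTG3602", 101, 1, 6, 6),
    (2244, "pTG3604", 569, 48, 6, 6),
]


def hr_over_v(with_insert: int, vector_alone: int):
    """HR/V as a number, or None when the control plate had no colonies."""
    return None if vector_alone == 0 else with_insert / vector_alone


def fold_change(vector: str):
    """HR/V with the 2,244-bp arm divided by HR/V with the 188-bp arm."""
    ratios = {bp: hr_over_v(w, a) for bp, v, w, a, _, _ in ROWS if v == vector}
    if ratios[188] is None:
        return None
    return ratios[2244] / ratios[188]


for bp, vector, w, a, pos, total in ROWS:
    print(bp, vector, hr_over_v(w, a), f"{pos}/{total} positive")
print("fold change, pTG3604:", round(fold_change("pTG3604"), 1))
```

For the pTG3604 vector the ratio rises roughly 13-fold, and the fraction of positive colonies goes from 4 of 17 to 6 of 6.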

These experiments demonstrate that the replacement or the 
deletion of regulatory or structural viral genes can be easily 
achieved by homologous recombination in E. coli. The 
advantages of using E. coli instead of yeast cells (24) include higher 
transfection efficiency, higher growth rates, and higher plasmid 
yields. Moreover, the E. coli single-step replacement strategy is 
technically more straightforward than the two-step replacement 
strategy classically used to target a segment of yeast 
DNA. We observed that the frequency of homologous 
recombination events and the efficiency of recovery of the expected 
recombinants could vary depending on the design of the DNA 
transfer strategy. In the described examples (Fig. 2B and C; 
Table 3), the frequency of homologous recombination could be 
enhanced 10- to 100-fold by increasing the length of the region 
with DNA homology. Moreover, the yield of positive 
recombinants was higher when the vector was linearized precisely in 
the targeted region, as in the E1 replacement experiments (Fig. 
2B and C). Modification of the E3 region was done with a 
target vector linearized at an SpeI site located 1,167 bp 
upstream from E3. As a consequence, two types of plasmids were 
obtained: the parental pTG3602 vector and the E3-modified 
Ad5 genome (Fig. 2A). The same phenomenon was observed 
in deletion experiments targeting the fiber gene or the E4 
region, which are located 4.8 or 5.8 kb downstream of the SpeI 
site, respectively: the frequency of correct recombination 
events was found to be inversely proportional to the distance 
between the targeted region and the SpeI site (unpublished 
observations). The percentage of positive colonies was 
nevertheless always >1%, and the identification of these colonies by 
classical screening techniques was straightforward. 

Transfection of PacI-restricted pTG3604, pTG3606, pTG3614, 
pTG3622, and pTG3623 plasmid DNA into 293 cells allowed 
the production of viral plaques with efficiencies similar to that 
obtained with pTG3602 (Table 2). In all cases, transfection of 
as little as 1 µg of viral DNA was sufficient to generate enough 
viral plaques for virus propagation and purification; the 
specific infectivity was around 30 plaques per µg of linearized 
adenovirus DNA in 293 cells (Table 2). As a comparison, 
adenovirus DNA excised from a YAC was shown to generate 
2 to 10 plaques per µg of total yeast DNA, corresponding to a 
calculated specific infectivity of 100 to 500 plaques per µg of 
viral DNA (24). The method recently described by Bett et al. 
(5) and McGrory et al. (29) generates 4 to 8 plaques per µg of 
transfected plasmid DNA. In the latter case, the transfer 
plasmid is cotransfected with a circular plasmid bearing the Ad5 
genome, which is in a relatively unstable configuration and is 
less infectious than linear DNA (12, 13). The procedures 
described here and by Ketner et al. (24) are based on prior 
manipulation of the full-length viral genome contained in 
single molecules of stable YACs or plasmids. Transfection into 
permissive eukaryotic cells of Ad5 genome excised from YAC 
or plasmid sequences allows the systematic recovery of pure 
virus plaques. Identification of the positive recombinant 
viruses by plaque screening and isolation of pure virus stocks by 
repeated plaque purifications are therefore not required for 
these methods. As a consequence, the construction of 
recombinant vectors by the described procedure is usually faster than 
that by previous methods; as an example, production of pure 
plaques of a vector in which E1 is deleted with pTG3602 and 
the E1-transfer plasmid as starting materials requires at most 
20 days compared with at least 6 weeks for methods in which 
similar starting materials (Ad5 DNA and E1-transfer plasmid) 
are cotransfected in 293 cells (3, 14). 
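
The figure of about 30 plaques per µg can be recovered from the 1-µg rows of Table 2 for 293 cells. The sketch below simply averages those counts; the pooling of all PacI-restricted plasmids into one mean is our choice, not something stated in the text. It also derives the viral share of total yeast DNA implied by the YAC figures quoted above.

```python
from statistics import mean

# Plaques per 6-cm dish on 293 cells after transfection of 1 µg of
# PacI-restricted plasmid DNA (Table 2)
PLAQUES_293_AT_1UG = {
    "pTG3602": [23, 33],
    "pTG3604": [53],
    "pTG3614": [20, 41],
    "pTG3606": [52, 27],
    "pTG3622": [17, 9],
    "pTG3623": [17, 17],
}

per_plasmid = {name: mean(counts) for name, counts in PLAQUES_293_AT_1UG.items()}
overall = mean(c for counts in PLAQUES_293_AT_1UG.values() for c in counts)

# YAC comparison: 2 to 10 plaques per µg of total yeast DNA is equated in the
# text with 100 to 500 plaques per µg of viral DNA.
implied_viral_fraction = 2 / 100

print(per_plasmid)
print(round(overall, 1), implied_viral_fraction)
```

The pooled mean comes out just under 30 plaques per µg, and the YAC conversion implies that viral DNA makes up about 2% of the total yeast DNA preparation.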

The analysis of the phenotype of all plasmid-derived viruses 
confirmed that the selected modifications were correctly and 
stably introduced into the viral genomes. As an illustration, we 
have shown that expression of E3 in AdTG3606 is no longer 
controlled by E1 but is driven by the constitutive LTRrsv 
promoter sequences inserted upstream from the E3 coding 
region (Fig. 3A). 293 and A549 cells were infected with 
wild-type Ad5 or E1-deleted AdTG3606 or AdTG3614 viruses at 
multiplicities of infection of 10 (data not shown) and 100 (Fig. 
3A). Radioimmunoprecipitation of the cell extracts with 
monoclonal antibody Tw1.3 (9) directed against the E3-encoded 
gp19K protein showed an expression of the viral glycoprotein 
in both cell lines infected with either Ad5 or 
AdTG3606. In contrast, infection with a virus with an E1 
deletion and carrying wild-type E3 sequences (AdTG3614) led to 
gp19K expression only in the E1-expressing 293 cells (Fig. 3A). 
In addition, the intravenous administration of AdTG3606 or 
AdTG3614 containing an hFIX expression cassette in place of 
E1 led to high concentrations of hFIX in C57BL/6 mice: 3 days 
after virus injection, mean levels of 300 and 650 ng/ml were 
detected in the sera of the animals treated with AdTG3606 and 
AdTG3614, respectively (Fig. 3B). 

FIG. 3. Foreign sequences introduced into recombinant viruses by in vivo 
homologous recombination are functional. (A) Insertion of the LTRrsv 
promoter sequences upstream of the Ad5 E3 region confers constitutive expression 
to gp19K. 293 and A549 cells were mock infected or infected for 24 h with 
AdTG3606, AdTG3614, or wild-type Ad5. Total proteins labeled with 
[35S]methionine/cysteine were immunoprecipitated with MAb Tw1.3, an anti-gp19K 
monoclonal antibody, and analyzed by sodium dodecyl sulfate-polyacrylamide 
gel electrophoresis. (B) Purified AdTG3606 or AdTG3614 recombinant vector in 
which the E1 region was replaced by an expression cassette for hFIX was injected 
into the tail vein of three C57BL/6 mice. Serum was recovered at 3 days 
postinfection and analyzed by enzyme-linked immunosorbent assay for the plasma 
concentration of human factor IX. The mean of the data for each set of animals is 
shown. The first lane (-) shows the results for a control, untreated mouse. In 
panel A, size markers (in kilodaltons [kD]) are indicated on the left. 

The procedure described in this report allows the rapid and 
efficient cloning and manipulation of full-length infectious ad- 
enovirus genomes in bacterial plasmids. This method combines 
the powerful genetic engineering techniques that are available 
in E. coli and the ability of this microorganism to recombine 
homologous sequences at a high frequency (6, 10, 11, 25, 30). 
The advantages of this technology are multiple: (i) all cloning 
and, more importantly, recombination steps are carried out in 
E. coli, (ii) the frequency of bacterial colonies containing the 
plasmid with the modified adenovirus genome is very high, (iii) 
any genetic region of the viral genome can be specifically 
modified or deleted if appropriate restriction sites are avail- 
able, (iv) plasmids containing the full-length adenovirus ge- 
nomes can be introduced into appropriate bacterial strains for 
production of large amounts of viral DNA, and (v) transfection 



4810 NOTES 



J. Virol. 



of excised recombinant adenovirus DNA into permissive human 
cells generates plaques containing only pure virus particles. 
We demonstrate that full-length recombinant adenovirus 
genomes modified in both E1 and E3 regions can be efficiently 
generated and that their rescue as pure viral particles was 
guaranteed in transfected 293 cells. We similarly produced 
adenoviruses with either deletions of or modifications in the 
fiber gene, the E2A gene, or the E4 region (unpublished data). 
The production of even-further-crippled viral vectors is 
theoretically possible with the same technology, provided that 
adequate restriction sites are available for Ad5 DNA 
linearization. The presence of the ClaI, BamHI, and SpeI sites located 
at positions 918, 21562, and 27082, respectively, already allows 
efficient modifications of most of the viral genes. However, 
improvement of this approach is possible by the introduction 
of new unique sites at other locations. Investigation of these 
sites is currently in progress. 
Abstract 

The genomes of all organisms contain an abundance of DNA repeats which are at-risk for causing [illegible] 
[illegible] Saccharomyces cerevisiae to investigate various repeat [illegible] 
for causing genomic instability and the role of DNA metabolism factors. Several types of [illegible] 
the likelihood of genetic changes such as mutation or recombination [illegible] 
replication or repair. Specifically, we have investigated inverted repeats, [illegible] 
the consequences of various DNA metabolism mutants. Because the at-risk motifs [illegible] are 
sensitive indicators, we have found that they are useful tools to reveal new genetic [illegible] as well 
as to distinguish subtle differences between alleles. © 1998 Elsevier Science B.V. All rights reserved. 
1. Introduction 

The balance between genome integrity and ge- 
netic instability is determined by a vast number of 
factors, most of which have been addressed at one 
time or another in the previous 399 volumes of this 
Journal. In recent years, it has become apparent that 
repeated DNA sequences can have a high potential 
for mutation and recombination which is greatly 



Abbreviations: ARMs, at-risk motifs; MMR, mismatch repair; 
DNA-Pol, DNA polymerase; DSB, double-strand break; IR, inverted 
repeat 
influenced by repair or replication capabilities. Since 
the likelihood of repeat induced genetic change usu- 
ally depends on arrangement rather than specific 
sequence, we have referred to these motifs as at-risk 
motifs or ARMs [1]. Motifs that can lead to non- 
canonical DNA structures [2] are a particularly inter- 
esting category of ARMs. For example, they might 
lead to novel structures during replication that are 
refractory to error correction or they might stimulate 
further chromosomal changes such as recombination. 

We have used the yeast Saccharomyces cere- 
visiae to identify a variety of ARMs that can give 
rise to non-canonical DNA structures and we have 
investigated the genetic controls of ARM-associated 
genomic instability. Because of their high potential 
for change, ARMs represent important threats to 
genome stability. Since these ARMs have human 
counterparts and there are many examples of human 
DNA metabolic gene homologues, the systems are 
also expected to provide an understanding of threats 
from ARMs in humans. 



2. Inverted repeats 

Large direct repeats have long been known to be 
capable of undergoing recombination. However, in- 
verted repeats (IRs) exhibit novel genome destabiliz- 
ing features, some of which were initially discovered 
by Collins [3] and Collins et al. [4]. IRs were the first 
members of an ARMs category whose effects are 
attributed to the generation of non-canonical DNA 
structures. Collins et al. showed that head-to-head 
IRs (palindromes) were deleted at extremely high 
rates in Escherichia coli. Palindromes comprised of 
at least 485 bp repeats were unable to propagate in 
bacterial cells; only partial or complete deletions 
could be recovered. The proposed mechanism of 
IR-stimulated deletions includes the generation of a 
hairpin structure in single-stranded DNA (ssDNA) that 
arises during replication as described in Fig. 1 (for 
reviews, see Refs. [5,6]). The non-canonical DNA 
hairpin structure would impair elongation of the 
nascent strand because the DNA polymerase (DNA- 
Pol) is stalled at the base of the hairpin. 
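
As an illustration of the sequence arrangement described above, the following is a minimal sketch, not taken from the cited studies, of how an inverted repeat can be recognized in a DNA string: one arm is followed, after an optional spacer, by its own reverse complement. The function names, the toy sequence and the parameters `arm` and `max_spacer` are hypothetical and chosen only for this example.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def find_inverted_repeats(seq: str, arm: int, max_spacer: int):
    """Yield (start, spacer) for every pair of arms that could form a hairpin.

    An inverted repeat is an arm of length `arm` followed, after `spacer`
    bases, by the reverse complement of that arm. A spacer of 0 corresponds
    to a head-to-head repeat (palindrome) in the sense used in the text.
    """
    seq = seq.upper()
    for start in range(len(seq) - 2 * arm + 1):
        target = reverse_complement(seq[start:start + arm])
        for spacer in range(max_spacer + 1):
            right_start = start + arm + spacer
            if right_start + arm > len(seq):
                break  # the second arm would run past the end of the sequence
            if seq[right_start:right_start + arm] == target:
                yield start, spacer


if __name__ == "__main__":
    # Toy sequence (hypothetical): a 10 nt palindrome with 5 nt arms.
    for start, spacer in find_inverted_repeats("TTGAATTCAA", arm=5, max_spacer=0):
        print(f"arm starts at {start}, spacer {spacer} nt")
```

Exact matching is used for simplicity; the nonidentical repeats and quasipalindromes discussed in this review would need a mismatch-tolerant comparison.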

Thus, the IR group of ARMs appears to exert its 
effect by forming structures that interfere with nor- 
mal DNA metabolism. The observed deletions could 
arise by several processes. The hairpin could be 
cleaved by a structure-specific nuclease and the sur- 
rounding DNA could be end-joined [5-8]. Alterna- 
tively, the hairpin could lead to replication slippage 
between fortuitous short repeats that might be pre- 
sent near the base of the stem as shown in Fig. 1 
[3,9-11]. 

Subsequent studies based in yeast indicate that 
IRs can stimulate deletions in eukaryotes via replication 
mechanisms. In these studies, which took advantage of 
mutants of the replicative DNA polymerases (DNA-Pol) 
α and δ, the deletion likelihood was increased 
in the replication-defective strains [12-14]. 
The deletion rates of various IRs, such as the 1.5 kb 
IRs of bacterial transposon Tn5 separated by a long 
(2.7 kb) spacer, short palindromes or quasipalin- 
dromes (IRs separated by only 8-9 bp), were in- 







Fig. 1. Replication model for IR-generated genomic rearrangements. 
This model is based on results with model IRs inserted in 
the LYS2 gene of yeast [5-7]. An IR insert flanked by short direct 
repeats (arrows inside black boxes) inactivates the LYS2 gene. 
During replication of the IR sequence (long arrows), single-stranded 
regions of DNA may give rise to a stem-like secondary 
structure, thereby inhibiting extension of the 3'-end and possibly 
leading to deletion or recombination. Deletion of the IR (precise 
excision leading to a Lys+ phenotype) could result from replication 
slippage between short direct repeats at the base of the secondary 
structure. Alternatively, the stalled replication could initiate 
recombination with a homologous LYS2 sequence. (In our constructs, 
this LYS2 sequence contains a mutation; thus, wild type 
recombinants can be selected based on their Lys+ phenotype. 
Possible events initiating recombination are discussed in the text.) 
If recombination involves identical sequences in sister chromatids, 
it will not result in rearrangements, while recombination within 
the same chromatid or with another chromosome can result in deletions 
or translocations. 

creased in the DNA-Pol mutants. (Note: bacterial 
transposon Tn5 was used in yeast studies [5,15] as a 
convenient DNA insert containing IRs flanked by 
short direct repeats (Fig. 1).) While the increase for 
unstable palindromes and quasipalindromes was 
moderate, a very strong effect was observed for the 
Tn5 which is stable in wild type cells. Deletions of 
this IR were increased up to 1000-fold in DNA-Pol 
δ and α temperature-sensitive mutants [5,15], 
demonstrating that stable ARMs can have a large 
potential for change. 

We found that an IR can stimulate not only 
deletions but also recombination in adjacent regions. 
IRs elevated interchromosomal allelic recombination 
[5], recombination between repeats in nonhomolo- 
gous chromosomes [6] and intrachromosomal recom- 
bination between homologous or highly diverged 
repeats [6,16]. As expected, some of the IR-stimu- 
lated intrachromosomal and interchromosomal 
recombination events were associated with deletions 
and translocations. The presence of a strong bi-direc- 
tional origin of replication in the spacer region within 
an IR reduced the rates of deletion and recombination 
[6]. This supports the view that IR-stimulated 
events are initiated by secondary structures formed 
between single stranded DNA regions of IRs during 
replication. It is interesting that high levels of intrachromosomal 
recombination have been found within 
IRs of a long (15 kb) palindrome sequence intro- 
duced into mice [10]. 

We proposed that arrest of the replication com- 
plex at the base of a hairpin formed by IRs in 
ssDNA could initiate IR-associated recombination as 
well as deletions [5-7]. Homologous recombination 
would, therefore, be an alternative to replication 
slippage when DNA elongation is blocked by a 
non-canonical DNA structure (Fig. 1). Three mechanisms 
have been proposed for IR-stimulated recombination. 
(i) Recombination could be initiated by the 
subsequent formation of double-strand breaks (DSB), 
which have long been known to be highly recombi- 
nagenic [17]. It was suggested [18,19] and recently 
demonstrated in E. coli [20] that impaired DNA 
strand elongation can result in DSBs. Alternatively, 
DSBs could result from cleavage by structure-specific 
endonucleases [9]. (ii) The stalled 3' end of the 
nascent strand could separate from the template, 
forming a 3'-tail that could invade homologous dou- 



ble-stranded DNA (dsDNA). This is reminiscent of 
the model suggested for bypass of a DNA lesion by 
means of recombination-dependent replication in 
bacteriophage T4 [21]. (iii) A single-stranded region 
that can occur near the secondary structure formed 
by IRs in the template strand could initiate a search 
for homology [7] as a first step in recombination. 

Both length of IRs and size of the spacer DNA 
between the repeats can influence deletion and recombination 
stimulated by this ARM [6]. The incidence 
of these genetic changes is directly related to 
the size of IRs and inversely related to the length of 
the spacer separating IRs. We found in this study 
that IRs are capable of stimulating exceptionally 
high levels of recombination leading to deletions and 
translocations in wild type yeast strains. A perfect 
palindrome formed by two head-to-head copies of a 
1.0 kb URA3 gene increased intra- and interchromosomal 
recombination in the adjacent region 2400-fold 
and 17,000-fold, respectively. 

Thus, the IR category of ARMs can have a sub- 
stantial effect on genome stability. Since mutations 
in a DNA replication polymerase appear to greatly 
increase IR associated changes, we suggest that such 
mutants represent a category of 'mutators' that exert 
their ARMs effects by increasing the likelihood of 



Table 1 

Sources of at-risk motif (ARM) instability and effects of mutators 

ARM                      Poor substrate for       Mutators that can            Mutator mechanism 
                                                  destabilize ARM 

Inverted repeats         DNA-Pol                  DNA-Pol                      Premutational structures (a) 

Homonucleotide runs      Proofreading             MMR                          Lack of repair (b) 

Short distant repeats    MMR                      DNA-Pol                      Premutational structures 
(deletions) 

Short distant repeats    Repair systems other     FEN1 (RAD27)                 Lack of repair 
(duplications)           than Rad27 (Fen1) 

Minisatellites           MMR (deletion)           DNA-Pol and FEN1 (RAD27)     Premutational structures 

Triplet repeats          Fen1 (Rad27)             (i) Alternative pathways     (i) Lack of repair; 
                                                  to Fen1 (Rad27) (c)          (ii) Premutational structures 
                                                  (ii) changes causing 
                                                  more flaps (c) 

The sources of mutator activity discussed in the text are named as: (a) [illegible] 
non-canonical DNA structures that are poor substrates for DNA metabolism. (b) Lack of [illegible] expansion [illegible] 
prevents a non-canonical DNA structure from becoming a mutation. (c) Proposed categories of mutators that could [illegible] 
are shown by italics. 
novel intermediates ('Premutation structure' in Table 
1). 

3. Homonucleotide runs and microsatellites 

Microsatellites are multiply reiterated short (less 
than 14 bp as suggested by Sia et al. [22]) repeats. In 
humans, they are more variable than nonreiterated 
genomic sequences and the changes are usually due 
to small deletions or insertions of a small number of 
microsatellite units. Variable microsatellites have 
been useful as mapping markers in human genetics 
[23]. 

Microsatellites are extremely unstable in tumor 
cells deficient in post-replication mismatch repair 
(MMR) (for reviews, see Refs. [24-26]). MMR defi- 
ciency in humans was associated with an inherited 
predisposition to several types of cancer. It was 
proposed that the predisposition was due to an in- 
creased likelihood of inactivation of tumor suppres- 
sor genes in MMR deficient cells. Small microsatel- 
lites formed by repeats of one nucleotide (homo- 
nucleotide runs) in the coding sequences (CDS) of 
putative tumor suppressor genes appeared to be pref- 
erential sites of mutational inactivation in MMR 
deficient cells [27-29] suggesting that these might 
represent an important class of ARMs. 

We have systematically examined the stability of 
homonucleotide run ARMs in wild type and MMR 
deficient yeast [30]. Extending the length of a 
homonucleotide sequence leads to an exponential 
increase in mutation rates. However, a MMR defi- 
ciency can greatly amplify the mutation rate. For 
example, the absence of MMR results in a 10,000-fold 
increase in mutation rate in A13 and A14 
homonucleotide runs. This is the largest impact of a 
single mutator observed in eukaryotes. For non-ARM 
DNA, strong mutator effects can be observed only 
when both DNA-Pol proofreading and MMR are 
inactivated [31,32]. The homonucleotide run ARM is 
more effective at increasing mutation rates caused by 
a MMR defect when compared with microsatellite 
ARMs containing longer (2-13 bp) repeat units [26]. 
This is probably due to a higher likelihood of repli- 
cation slippage within the homonucleotide run as 
well as due to more efficient correction of smaller 
loops in wild type cells (see Ref. [33] and discussion 
below). 







Fig. 2. DNA-Pol proofreading and mismatch repair preventing 
mutations caused by replication slippage within microsatellite 
DNA. Presented is a 'bulge' in the nascent strand that can lead to 
insertion of one repeat unit. Both DNA-Pol proofreading 3'->5' 
exonuclease and MMR can prevent this mutation. However, when 
a bulge is distant from the 3' end, it is proposed to be inaccessible 
to the proofreading exonuclease. 



Based on results with E. coli and yeast [30,34] 
and with in vitro replication systems [35], microsatel- 
lites are hypermutable in MMR deficient cells be- 
cause frameshift intermediates escape DNA-Pol 
proofreading (Fig. 2). Since proofreading and MMR 
act in series to remove replication errors [31,36], 
escape from DNA-Pol proofreading would leave the 
entire burden of preventing mutations within these 
ARMs to the MMR system. 

While a MMR defect greatly enhances the poten- 
tial of a homonucleotide run ARM to cause muta- 
tion, it differs from DNA-Pol defects on IRs which 
change the incidence of non-canonical DNA struc- 
ture formed by ARMs. The MMR defect does not 
increase the number of mutation intermediates 
(bulges) in homonucleotide runs. Instead, the impact 
of the mutator is due to a lack of repair (Table 1) of 
premutational structures, since mutation intermediates 
that arise in homonucleotide run ARMs are poor 
substrates for proofreading. 



4. Short distant repeats 

Short ( < 10 bp) direct repeats separated by less 
than 100 bp can lead to duplications and extended 
deletions of the DNA between repeats (Fig. 3) in 
various prokaryotes and eukaryotes, especially in 
[Fig. 3 diagram; legible labels: unique sequence, 4-111 bp, <100 bp, 4-12 bp, duplication] 




Fig. 3. Deletions and duplications associated with short direct 
repeats. Unique sequence that is either deleted or duplicated is 
indicated by the hatched line. The duplications and deletions 
involve gain or loss of one of the short direct repeats (R). The 
sizes of repeats and distances between repeats are based on studies 
in yeast [33,37,38]. 



cells with mutator mutations [8,33,37-40]. These 
changes most likely occur via end-joining repair of a 
double-strand break or via replication slippage be- 
tween short repeats. Since the short distant repeats 
are frequent in all genomes (see below), they are an 
important category of ARMs that can cause rear- 
rangements. 
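
The outcome of the two rearrangements described above (loss or gain of one repeat copy together with the unique sequence between the repeats, as in Fig. 3) can be sketched on a string. This is only a bookkeeping illustration under the assumption of a single event between the first two copies of an exactly matching repeat; it does not model the underlying slippage or end-joining mechanism, and the function name and example sequence are hypothetical.

```python
def slip_between_repeats(seq: str, repeat: str, mode: str = "deletion") -> str:
    """Return the product of one event between the first two copies of `repeat`.

    deletion:    ...R-unique-R...  ->  ...R...
    duplication: ...R-unique-R...  ->  ...R-unique-R-unique-R...
    """
    first = seq.find(repeat)
    second = seq.find(repeat, first + len(repeat)) if first != -1 else -1
    if second == -1:
        raise ValueError("need two non-overlapping copies of the repeat")
    unit = seq[first:second]  # one repeat copy plus the unique spacer
    if mode == "deletion":
        return seq[:first] + seq[second:]
    if mode == "duplication":
        return seq[:second] + unit + seq[second:]
    raise ValueError("mode must be 'deletion' or 'duplication'")


if __name__ == "__main__":
    # Hypothetical example: a 7 bp repeat flanking 4 nt of unique sequence.
    example = "AA" + "GATTACA" + "cccc" + "GATTACA" + "TT"
    print(slip_between_repeats(example, "GATTACA", "deletion"))
    print(slip_between_repeats(example, "GATTACA", "duplication"))
```

In both cases the length changes by exactly one repeat plus the spacer, which is why a deletion of an insert flanked by such repeats can restore the original reading frame.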

We developed a novel system that enabled us to 
study 31 bp and 61 bp deletions in the yeast LYS2 
gene (Fig. 4 and Refs. [33,37]). Deletions occurred 
between 6 bp or 7 bp short repeats flanking the insert 
that inactivates LYS2. Deletions of the insert 
restore LYS2 function and can be selected as 
Lys+ revertants. A role for replication slippage was 
supported by the observation that the spectrum of 
deletion breakpoints depended on orientation relative 
to replication origin. A temperature-sensitive mutation 
in DNA-Pol δ (pol3-t) increased deletion rates 
up to 1000-fold, suggesting a role for replication in 
increasing the likelihood of genetic changes in these 
ARMs. (The pol3-t mutation is due to a G to A 
substitution resulting in a Gly to Ala change at position 641 
near the conserved region VI [7]. The mutation 
probably alters a polymerase rather than a proofread- 
ing function, since it is far from the proofreading 
3'-5' exonuclease domain [41]. However, we have 
not excluded intraprotein interactions that might af- 
fect proofreading.) 

We found that, unlike extended deletions between 
distant repeats, rates of -1 frameshifts in 
small (2-4 bp) homonucleotide runs were not affected 
by pol3-t [33]. However, there was a considerable 
potential for pol3-t-stimulated genetic change 
in these runs. The pol3-t mutation caused increased 
rates of frameshifts when MMR was inactivated by 
the msh2, msh3 mutations disrupting recognition of 



a mismatch or the pms1 mutation disrupting the second 
stage of MMR complex formation, suggesting that 
the frameshift intermediates caused by pol3-t were 
prevented by MMR from becoming mutations. In 
contrast, MMR was completely unable to prevent 
large deletions caused by pol3-t, although MMR 
could in part prevent deletions of 7-bp sequences 
between short repeats also studied in this work (not 
shown on Fig. 4). We proposed (Fig. 4) that un- 
paired DNA loops can be formed by replication 
slippage between repeats and that these loops can be 
repaired by MMR only if they are small (< 8 nt in 
our experiments). The absence of an effect of MMR 
on larger loops (> 30 nt in our experiments) could 
be due to a lack of recognition or because there is no 
directionality of MMR for such loops. Thus, the 
ability of pol3-t to cause high rates of ARM-associated 
changes is due to a high likelihood of forming 
a mutation intermediate (Premutational structure in 
Table 1) that is a poor substrate for a second DNA 
metabolic system, MMR. 

Another type of genetic change associated with 
the short separated repeat category of ARMs is the 
generation of duplications (Figs. 3 and 5). These 
duplications are greatly (> 1000-fold) increased in a 
yeast rad27-null mutant and, similar to pol3-t-stimulated 
deletions, cannot be prevented by MMR [38]. 




Fig. 4. Looped intermediate arising by replication slippage leads 
to deletions between short direct repeats. DNA polymerase (gray 
ball) slippage between two repeats generates an unpaired loop. For 
the model LYS2 system in yeast [33,37], slippage between two 
short (6-7 bp) direct repeats (thick arrows) in the yeast LYS2 
gene separated by either 31 or 61 nt is shown. One of these 
repeats is at the end of an insert (shown by gray line) that 
inactivates the LYS2 gene (Lys-). Another repeat is in the LYS2 
sequence adjacent to the insert. If the looped intermediate is not 
repaired back to the initial sequence by MMR, the nascent strand 
yields a Lys+ revertant. 
Fig. 5. Defect in removal of a displaced flap can lead to duplications 
between distant short repeats. Presented is a model describing 
how a flap is normally removed. (i) Displacement of the 
5'-end of the downstream Okazaki fragment generates a 5'-flap 
containing genetic redundancy (C and D); (ii) the Fen1 (Rad27) 
endonuclease is loaded at the 5'-end of the flap and slides down to 
the base of the flap; (iii) the flap is removed by the Fen1 endonuclease. 
If the pathway of flap removal is blocked by mutation 
[38] or by non-canonical DNA structure in the flap (proposed by 
Gordenin et al. [1]), duplications can occur between direct repeats 
(gray thick lines) included in the flap by sequence realignment or 
via end joining if a double-strand break occurs at the base of the 
flap [38]. 

The RAD27 gene (a homologue of human FEN1) 
controls the 5'-flap endonuclease responsible for the 
removal of displaced flaps at the border between 
Okazaki fragments in the lagging strand. The unremoved 
flaps create temporary genetic redundancy 
and can lead to duplications. In the absence of Rad27 
(Fen1), unremoved flaps may be processed via 
error-prone pathways leading to duplications via flap 
realignment or via creation of a DSB followed by 
end-joining (see Fig. 4 in Ref. [38]). 



5. Minisatellites 

Minisatellite DNAs are tandem repeats where the 
repeat unit is longer than the repeats in microsatellites. 
The actual distinction between minisatellites 
and microsatellites has been arbitrary. Some minisatellites 
are highly unstable in the human genome 
[42]. Genetic control of minisatellite instability appeared 
to be different from that of microsatellites. 
Sia et al. [26] showed that inactivation of MMR 
increased up to 6300 fold the rate of changes in 
tandem short repeats (1-13 bp). However, a MMR 
defect did not destabilize multiple tandem repeats 
with longer (16 bp and 20 bp) repeat units. This 
suggested that the intermediates leading to deletions 
and duplications in minisatellite DNAs (e.g., un- 
paired loops resulting from replication slippage or 
displaced 5'-flaps) are not substrates for MMR. (It 



was suggested [22] that microsatellites and minisatellites 
could be distinguished on the basis of the ability of 
MMR to prevent repeat deletion and duplication. In 
yeast, such a border would be placed between 13 bp 
and 16 bp repeat sizes.) 

As noted above, microsatellite ARMs are highly 
unstable in MMR deficient human cells, which also 
makes them a convenient indicator of MMR defects 
[24,25]. Unlike for microsatellites, no mutator mutations 
had been reported that could destabilize minisatellites. 
Based on the observation that pol3-t 
strongly facilitated deletions [33] and rad27 highly 
increased duplications [38] associated with random 
short repeats and that MMR did not prevent these 
mutations, we proposed that these two mutators could 
increase deletions and duplications in minisatellite 
ARMs [43]. In support of this proposal, pol3-t and 
rad27 were demonstrated to elevate rates of changes 
in the model yeast minisatellite comprised of three 
20 bp repeats (collaborative results of the Petes lab 
and this lab [44]). 



6. Triplet repeats and other small repeats prone 
to expansion 

There is a subset of microsatellite and minisatel- 
lite ARMs in the human genome that are capable of 
not only undergoing small deletions and duplications 
but also of giving rise to large expansions. Increases 
in repeat units vary from several to over 100 fold 
and the probability of expansion increases greatly in 
long repeats [45-49]. Expansions are primarily ob- 
served with the trinucleotide sequences CTG (CAG), 
CCG (CGG) and GAA (TTC). Recently, expansion 
was also observed in two minisatellites [50-52]. It is 
generally acknowledged that repeats that are prone to 
expansion can form non-canonical DNA structures 
[48,49,53]. We consider the triplet repeats as ARMs 
because it is the arrangement of the sequence that is 
key to expansion. 

As discussed above, repeats can become ARMs 
because they form premutation intermediates that are 
poor substrates for some DNA metabolic activity. 
We have proposed that the non-canonical DNA 
structure that leads to triplet repeat expansion is a 
poor substrate for enzyme(s) that process the 'flap' 
replication intermediates [1]. DNAs can expand if the 
flap that arises during lagging strand replication is 
not removed by the Rad27 (Fen1) flap endonuclease 
(discussed above and in Refs. [1,38,43]). Since the 
mammalian protein Fen1 can process a flap only if 
the flap is completely single-stranded [54,55], a secondary 
structure formed by a flap will be a poor 
substrate for this enzyme. We have proposed that 
the capability of some repeats to form hairpins, 
triplex DNA or other non-canonical DNA structures 
in the flap would account for specificity of repeat 
expansion [1]. 

Recently, it was demonstrated that rad27 (fen1) 
deficiency in yeast can cause not only duplications 
of unique sequences but also small and large expan- 
sions in CAG (CTG) repeats [56,57]. We anticipate 
additional mutators that can increase the likelihood 
of repeat expansion (Table 1). Mutators could exert 
their effects by increasing the incidence of structures 
that are poor substrates in DNA metabolism, such as 
pol3-t which leads to more replication slippage loops, 
or rad27 which leads to more flaps. Alternatively, 
mutators could be in genes whose products are in- 
volved in removing or processing replication inter- 
mediates even if they assume non-canonical struc- 
tures. An example of such a mutator is the MMR 
defect described above that causes greatly increased 
mutation rates due to lack of removal of the errors 
that escape DNA-Pol proofreading. Mutators for in- 
creased triplet repeat expansion might be due to 
replication defects that result in a higher incidence of 
strand displacement in the lagging strand (such as 
rad27) or loss of a proposed exonuclease that acts 
like Rad27 (Fen1) but whose ability to process flaps 
would not be inhibited by secondary structure. 



7. ARMs as sources of human genetic risk 

7.1. Genes with ARMs are highly susceptible to 
change 

Although an ARM is only a small sequence within 
a gene, there are several examples demonstrating that 
the genetic impact of an ARM can be immense 
relative to possible changes in the rest of a gene. 

Deletions involving distantly separated short re- 
peats or duplications in coding sequences can lead to 



loss of gene function. Among 100 bp random se- 
quences, approximately 70% are expected to contain 
a direct repeat of at least 6 bp, suggesting that almost 
all coding sequences contain this ARM [58]. Thus, 
they might have a disproportionate effect on genetic 
change, especially under conditions of altered replication. 
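
The figure of approximately 70% quoted above can be checked roughly. A 100 bp sequence contains 95 windows of 6 bp and therefore 95 x 94 / 2 = 4465 pairs of windows; if each pair matches with probability 4^-6 = 1/4096, about 1.09 matching pairs are expected, and a Poisson approximation gives a probability of about 1 - e^-1.09, or roughly two-thirds, of at least one repeat. The sketch below is a minimal Monte Carlo version of the same estimate. It assumes uniformly random, equiprobable bases and counts overlapping repeats, which may differ from the assumptions behind Ref. [58]; the function names are illustrative only.

```python
import random


def has_direct_repeat(seq: str, k: int = 6) -> bool:
    """Return True if any k-mer occurs at two different positions in seq."""
    seen = set()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in seen:
            return True
        seen.add(kmer)
    return False


def fraction_with_repeat(length: int = 100, k: int = 6,
                         trials: int = 20000, seed: int = 0) -> float:
    """Estimate the fraction of random sequences containing a direct repeat."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice("ACGT") for _ in range(length))
        if has_direct_repeat(seq, k):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    print(f"fraction of 100 bp sequences with a 6 bp repeat: "
          f"{fraction_with_repeat():.2f}")
```

Real coding sequences are not random, so the simulation only indicates the order of magnitude; it supports the point that short direct repeats are nearly unavoidable in genes of ordinary length.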
Triplet repeat expansion causes inactivation of 
several genes associated with neurological and mus- 
cular function. If the stretch of repeats is long, the 
probability of expansion in the germ line can reach 
nearly 100% [59]. 

Several coding sequences contain repeated pep- 
tide motifs encoded by minisatellites with more or 
less perfect identity between repeats [60-62]. Varia- 
tions in minisatellite length are common sources of 
change in encoded protein as described for human 
and porcine mucin genes [63-66]. 

Alu repeats are abundant in the human genome 
and can be associated with rearrangements that lead 
to disease (see examples in Refs. [67-72]). Based on 
Alu-PCR [73] and indirect physical methods [74,75], 
many Alu repeats form closely spaced IRs although 
they are nonidentical. Several examples of closely 
spaced Alu IRs have been identified during sequencing 
of the human genome [72,76-79], including 
some that are nearly palindromic [80]. It is worth 
noting that very closely spaced Alu repeats usually 
exist in direct orientation. A survey of a large number 
of Alus in the human genome revealed that only 47 
out of 634 closely spaced (< 21 bp) Alu repeats 
were in the inverted orientation [81]. Such a preference 
would be expected if very closely spaced long 
IRs in humans are as unstable as in bacteria and 
yeast. IRs formed by Alu and other genomic repeats 
could lead to a high frequency of rearrangements 
especially in mutator backgrounds. 

The long homonucleotide ARMs combined with 
mutators inactivating MMR are especially likely to 
cause gene inactivation. We directly evaluated the 
impact of A13 and A14 homonucleotide runs in 
yeast. In a mismatch repair deficient (Mmr-) (msh2) 
strain, mutations that inactivate the LYS2 gene due 
to frameshifts in an A13 run occur 100 times more 
frequently than mutations in the rest of the 4179 bp 
LYS2 coding sequence [30]. Thus, genes with long 
homonucleotide runs will form a gene group that is 
at-risk for inactivation in Mmr- cells. Since many 
tumors are MMR deficient, it is important to know 
which genes contain long homonucleotide runs in 
their coding sequences. Some of these genes could 
be important for cancer progression and for sec- 
ondary effects of cancer. For example, the coding 
sequence of the structural gene for human parathy- 
roid hormone-like protein (hPTHrP [82-84]) con- 
tains an A n run. Since this protein is associated with 
hypercalcemia of malignancy, an important secondary 
effect identified with many tumors [85], the controlling 
gene could be a frequent target for mutation in 
Mmr- tumors, possibly resulting in the hypercalcination 
syndrome. 

Long homonucleotide runs in coding sequences 
may represent polymorphisms in human populations. 
For example, the A8 run in the APC gene leads to a 
predisposition to colorectal cancer [86]. The most 
common allele in the population contains the interrupted 
run A3TA4, and both alleles produce functional 
protein. In colorectal cancer patients who originally 
carried the A8 allele, somatic mutations were 
often found inside the A8 run or in the flanking 
nucleotide. This suggested that the predisposition 
to colorectal cancer is due to a high likelihood 
of inactivation of the A8 allele. Resequencing 
of interrupted homonucleotide runs in coding se- 
quences of human genes from many individuals can 
yield information about potentially harmful polymor- 
phisms. 

It is interesting that homonucleotide runs as well 
as various repeats can pose another kind of threat to 
human health. Pathogenic bacteria can undergo cell 
surface changes that enable them to escape immuno-surveillance 
as a result of simple frameshift 
mutations [87,88]. The cell changes provide for adaptive 
evolution so that the 'phase changes' which are 
reversible (by mutation) enable rapid yet relatively 
stable changes to new variants. Reduction in mis- 
match repair would increase the likelihood that these 
ARMs would be mutated. 

7.2. Multiple gene inactivation is increased for genes 
containing ARMs 

Phenotypic changes for a particular characteristic 
may involve mutations in more than one gene so that 
the likelihood of change is the product of mutation 



rates for the individual genes. Since genes containing 
ARMs such as homonucleotide runs are mutation 
prone, the likelihood of more than one gene being 
inactivated or altered is much greater for such genes 
than for genes lacking an ARM. For example, the 
rate of inactivation of the LYS2 gene in a Mmr- 
(msh2) strain is about 10^-5; however, the rate is 
increased to 10^-3 if the gene contains an in-frame 
A13 homonucleotide run [30]. The rate of appearance 
of double mutants for the group of genes 
containing the long homonucleotide run would be 
10^-6 as compared to 10^-10 for most genes which 
lack such ARMs, suggesting that multiple mutants 
for these at-risk genes have a reasonable likelihood 
of occurring. 
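
The comparison above follows from multiplying the single-gene rates, which assumes that the two inactivation events are independent. A minimal sketch of the arithmetic, using only the per-gene rates quoted in the text (the function and variable names are illustrative):

```python
def double_mutant_rate(rate_gene_a: float, rate_gene_b: float) -> float:
    """Rate of inactivating two genes, assuming the events are independent."""
    return rate_gene_a * rate_gene_b


# Per-gene inactivation rates in an MMR-deficient (msh2) strain, from the text.
RATE_WITHOUT_ARM = 1e-5   # LYS2 without a long homonucleotide run
RATE_WITH_ARM = 1e-3      # LYS2 carrying an in-frame run of 13 As

plain_pair = double_mutant_rate(RATE_WITHOUT_ARM, RATE_WITHOUT_ARM)
arm_pair = double_mutant_rate(RATE_WITH_ARM, RATE_WITH_ARM)
enrichment = arm_pair / plain_pair

if __name__ == "__main__":
    print(f"two genes without ARMs: {plain_pair:.0e}")
    print(f"two genes with ARMs:    {arm_pair:.0e}")
    print(f"enrichment:             {enrichment:.0e}")
```

A 100-fold difference per gene thus becomes a 10,000-fold difference for a pair of genes, and the gap widens by a further factor of 100 for each additional gene required.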

7.3. Preferential occurrence of premutational DNA 
structures in ARMs 

ARMs, such as homonucleotide runs, could be 
sites within the genome at which premutational 
changes occur preferentially. Based on the high mu- 
tation rates in long homonucleotide runs in MMR 
deficient yeast cells, large numbers of mismatches 
are expected to occur in these ARMs. The human 
genome contains many long homonucleotide runs in 
noncoding sequences. For example, there are up to a 
million Alu repeats in a diploid human cell and most 
of these contain a long polyA run [74,89]. If human 
cells have the same high rates of mismatch generation 
in long homonucleotide runs as yeast (i.e., 10^-3 
for runs containing ≥ 13 As), a thousand mismatches 
are expected to be generated during every round of 
replication of the Alus within the human genome. 
Considerable numbers of mismatches are also ex- 
pected to be generated in other microsatellite se- 
quences. The number of mismatches in nonreiterated 
sequences should be much less. Based on rates of 
forward mutation in genes without long homonu- 
cleotide runs in MMR deficient human cells (around 
10" 5 per 1 kb of coding sequence for the HPRT 
gene [90]) it is expected that around 10 mismatches 
would be generated during each replication in the 
~ 10* nucleotides of non-microsatellite sequences in 
the human genome. Thus, mismatches (unpaired 
loops) in homonucleotide and other microsatellite 
ARMs are expected to constitute the predominant 
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substrate for MMR in wild type human cells. Under 
conditions of limited MMR, they might saturate the 
MMR system, and in the absence of MMR, they 
would be primary sites of mutation. If unrepaired 
mismatches lead to secondary events such as DNA 
damage signaling response, these ARMs would be a 
major source of such signals. 
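
The two estimates in this section can be reproduced with a short calculation. The sketch below uses only the order-of-magnitude inputs quoted above; the figure of about 10⁹ non-microsatellite nucleotides is the value implied by the stated result of about 10 mismatches, and all names are illustrative.

```python
# Illustrative sketch of the expected number of mismatches per replication round.
# Inputs are order-of-magnitude figures from the text, not measured values.

ALU_REPEATS_PER_DIPLOID_CELL = 1e6    # "up to a million", most with a long polyA run
MISMATCH_RATE_PER_LONG_RUN = 1e-3     # yeast rate for runs of >= 13 As, assumed for human cells

NON_MICROSATELLITE_NUCLEOTIDES = 1e9  # assumed size of non-microsatellite sequence
MISMATCH_RATE_PER_KB = 1e-5           # per 1 kb, from HPRT in MMR-deficient cells


def expected_events(n_targets: float, rate_per_target: float) -> float:
    """Expected number of mismatches generated in one round of replication."""
    return n_targets * rate_per_target


alu_mismatches = expected_events(ALU_REPEATS_PER_DIPLOID_CELL, MISMATCH_RATE_PER_LONG_RUN)
other_mismatches = expected_events(NON_MICROSATELLITE_NUCLEOTIDES / 1000, MISMATCH_RATE_PER_KB)

print(f"mismatches in Alu polyA runs:         about {alu_mismatches:.0f}")
print(f"mismatches in non-microsatellite DNA: about {other_mismatches:.0f}")
```

With these inputs the Alu-associated runs alone contribute roughly a hundred times more mismatches than the rest of the genome, which is the basis for regarding them as the predominant MMR substrate.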

7.4. ARMs can amplify the effects of mutators 

Mutators that can increase the incidence of certain 
premutational structures, such as pol3-t generated 
replication slippage loops [33] or unprocessed flaps 
in the rad27-null mutant [38], may have a much 
greater effect in ARM sequences. Premutation structures 
in ARMs are likely to be stabilized or likely to 
lead to secondary events (e.g., due to presence of 
small repeats). The pol3-t and rad27 mutators increase 
rates of repeat-associated deletions [33] and 
duplications [38], respectively, about 1000 fold, 
whereas they cause much smaller increases in rates 
of nonspecific mutations leading to gene inactivation 
(< 10 fold increase for pol3-t [91] and a 50-fold 
increase for rad27 [38]). Short quasipalindromes did 
not show a recombinagenic effect in wild type yeast, 
but they did increase recombination more than 10-fold 
in the pol3-t background [6]. Mutators that 
occur due to a defect in the repair of premutational 
DNA structures can also cause a greater increase of 
mutation rate in ARMs than in non-ARM DNA. For 
example, the MMR defect caused a 10,000-fold increase 
in the mutation rate for long (A₁₃ and A₁₄) 
homonucleotide runs, whereas the mutation rates in 
other sequences are increased less than 100-fold [30]. 

7.5. At-risk sequences as ARMs 

There are many sites within the genome where 
specific DNA transactions occur and these might be 
sites for incorrect changes resulting in genome 
instability. For example, breakpoints have been iden- 
tified in lymphoma-associated translocations where 
the DNA sequence is similar to immunoglobulin 
switch recombination sites [92-95]. These transloca- 
tions probably occur as a result of mistakes in the 
immunoglobulin gene switching recombination. 
Hotspots for acridine-induced frameshift mutation in 
T4 phage fit into the category of at-risk consensus 
sequences. These hotspots, which often correspond 
to cleavage sites for DNA topoisomerase II, probably 
result from an aberrant cleavage reaction [96]. 

8. ARMs control and genome stability 

8.1. Using ARMs to find mutators and vice versa 

The high potential for ARMs to cause genomic 
changes can be used as a tool for identifying muta- 
tors. Microsatellite ARMs are sensitive indicators of 
MMR defects in human cell lines and cancer tissues 
[24,25], as well as microbial systems. The hyper- 
mutability of ARMs in mutators can be used to find 
new mutators, especially if the mutator has only a 
small overall effect on mutations in the genome. For 
example, a mutator defect that leaves only 0.1% of 
mismatches unrepaired in the lys2-A₁₄ allele described 
above would still be expected to cause a 
10-fold increased mutation rate. This property of 
long homonucleotide runs has, in fact, been used to 
find new mutators that can affect MMR. We found 
that inactivation of the yeast EXO1 gene (encoding 
one of the redundant 5' to 3' exonucleases) caused a 
100-fold increase in the mutation rate within an A₁₄ 
run, corresponding to only 1% of mismatches being 
unrepaired [97], whereas it had only a small effect in 
other mutation systems. It is unlikely that this gene 
would have been picked up with other mutation 
screens, although other approaches have led to identification 
of EXO1 interaction with MMR [98]. The 
long homonucleotide runs have also been useful for 
studying effects of specific mutations in the MSH2 
gene and the identification of subtle changes in 
MMR function (unpublished collaborative results of 
Kunkel and Resnick labs at NIEHS). Based on the 
high sensitivity of the ARM-based mutator screens 
we propose that they can be used to identify very 
weak mutator polymorphisms in the human popula- 
tion. 
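
The relation between the fraction of unrepaired mismatches and the observed fold increase can be sketched as below. This is a simplified model, not a fitted one: it assumes that the mutation rate in a long run scales linearly with the unrepaired fraction and that a complete MMR defect corresponds to the 10,000-fold increase reported for long homonucleotide runs. The function name and the floor at the wild-type level are illustrative choices.

```python
# Illustrative sketch: sensitivity of a long homonucleotide run to a weak mutator.
# Assumes a linear relation between unrepaired mismatches and mutation rate.

FULL_DEFECT_FOLD = 10_000  # fold increase in long A runs when MMR is absent


def fold_increase(unrepaired_fraction: float,
                  full_defect_fold: float = FULL_DEFECT_FOLD) -> float:
    """Approximate fold increase in mutation rate for a partial repair defect.

    The result is not allowed to drop below 1.0 (the wild-type level).
    """
    return max(1.0, unrepaired_fraction * full_defect_fold)


for fraction in (0.001, 0.01, 1.0):
    print(f"{fraction:.1%} unrepaired -> about {fold_increase(fraction):.0f}-fold")
```

Under this model a defect that leaves 0.1% of mismatches unrepaired gives about a 10-fold increase and a 1% defect gives about a 100-fold increase, which matches the figures quoted for the hypothetical weak mutator and for EXO1.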

Just as ARMs can be used to identify weak 
mutators, we propose that strong mutators might 
reveal motifs or consensus sequences that are more 
prone to spontaneous or damage induced changes. 
This could be addressed directly by inserting the 
regions of the human genome into model mutation or 
recombination substrates and examining them in various 
mutator strains. Thus, the mutator strains provide 
'sensitized' systems for revealing new ARMs. 
This approach is likely to prove useful in the study 
of novel categories of mutators. For example, Glassner 
et al. [99] and Xiao and Samson [100] showed 
that overexpressed yeast protein Mag1 (3-methyladenine 
DNA glycosylase) greatly increases mutation 
frequencies in E. coli and yeast. They proposed 
that the mutator effect might be due to the increased 
level of potentially mutagenic abasic sites resulting 
from the release of normal bases by Mag1. This 
activity might be inefficient in wild type cells ex- 
pressing natural levels of glycosylase [99]. Such a 
release was recently demonstrated for E. coli, yeast, 
and human glycosylases [101]. It would be interest- 
ing to determine if this activity exhibited sequence or 
motif specificity. 

8.2. Environmental factors that amplify the impact of 
ARMs 

ARMs that affect spontaneous changes in the 
genome could also be sites of increased instability 
due to environmental factors. For example, the 
frameshift mutagen ICR-191 and the carcinogen 
N-2-acetylaminofluorene induce mutation preferentially 
in homonucleotide or dinucleotide repeats 
(reviewed in Ref. [102]). Recently, it was demonstrated 
that minisatellite ARM mutations in cultured mouse 
cells can be facilitated by the tumor promoter okadaic 
acid [103]. Alternatively, there might be motifs that 
only become at-risk when cells are exposed to envi- 
ronmental factors. 

Since altered DNA replication (DNA-Pol, flap 
endonuclease) can increase deletions and recombination, 
drugs affecting replication may increase ARM-associated 
genome instability. Agents such as cytosine 
arabinoside and antifolate drugs, which cause 
recombination, amplification and other chromosome 
changes [104-106], would be good candidates. 
Treatment of cultured human cells with antifolates 
and several other agents that interfere with DNA 
replication causes cytologically detectable gaps in the 
regions of chromosomes called fragile sites. Many of 
these sites are formed by ARMs prone to expansion, 
i.e., CCG (CGG) triplet repeats (reviewed in Refs. 
[45,107]) or AT-rich minisatellites [50]. Some fragile 
sites are associated with chromosome breakage and 
rearrangements in humans [108,109]. The reactivity 
of fragile sites to replication inhibitors observed in 
cytological preparations suggests that fragile sites 
may contain ARMs prone to induced changes. 



9. A general view of ARMs 

ARMs' potential for causing genomic instability is 
clearly due to an interplay between ARMs and DNA 
metabolic systems. With the sequencing of many 
genomes, including that of humans, and the ability to 
modify DNA metabolic systems, we anticipate the 
identification of new ARMs and factors that affect 
ARMs stability. ARMs may be important compo- 
nents in the response to environmental agents. Inves- 
tigations of the interaction of ARMs with chemicals 
that damage DNA or modify the DNA metabolic 
machinery are likely to reveal novel genetic threats 
and identify genomic regions particularly prone to 
environmentally induced changes. 



This review summarizes recent data on the development of non-drosophilid insect transformation 
systems. The discussion focuses on one particular approach to developing transformation 
systems that relies on the use of short inverted repeat-type transposable elements analogous 
to that employed for Drosophila melanogaster transformation. Representatives from four families 
of short inverted repeat-type transposable elements have been shown to either act as non-drosophilid 
gene vectors or to have the ability to transpose accurately when introduced into 
non-host insect cells. Minos, a member of the Tc1 family of elements isolated originally from 
D. hydei, has been successfully used as a germline transformation vector in the Medfly, Ceratitis 
capitata. Hermes, a member of the hAT family of elements isolated originally from Musca 
domestica, has been successfully used as a gene transformation vector in D. melanogaster and 
has a host range that appears to include culicids. hobo, another member of the hAT family 
of elements isolated from D. melanogaster, also has a broad host range that includes tephritid 
fruitflies. mariner (Mos1), a member of the mariner family of elements isolated from D. mauritiana, 
can transpose in calliphorids. Finally, piggyBac/IFP2, a member of the TTAA-specific 
family of elements isolated from Trichoplusia ni, can transpose when introduced into Spodoptera 
frugiperda cells. Although routine transformation of insects other than D. melanogaster 
is not possible, it is clear that the raw materials for the development of efficient transformation 
systems are now available. Copyright © 1996 Elsevier Science Ltd 

Transposable elements  Gene vectors  Transformation  Transgenic insects  P-elements  mariner  hobo 
Hermes  Minos  Tc1  Musca domestica  Drosophila melanogaster  Aedes aegypti  Lucilia cuprina 
Cochliomyia hominivorax  Ceratitis capitata  Bactrocera tryoni  Sterile insect technique  Biological 
control 



INTRODUCTION 

The technology for creating transgenic non-drosophilid 
insects will have many applications. Some applications, 
such as manipulating Apis mellifera or Bombyx mori 
breeding stocks, are perfectly analogous to the uses of 
transgenic technology in plant and livestock breeding 
programs. Other proposed applications, such as replacing 
populations of pest insects with a genetically engineered 
non-pest strain, are novel and controversial (Collins, 
1994; Curtis, 1994; Spielman, 1994). Recent advances in 
research to develop this technology have resulted in the 
development of the basic components required for 
efficient and routine introduction of genes into the gen- 
omes of insects of economic and medical importance. 
To a very limited extent these components have been 
assembled and used successfully. 

There are numerous strategies for transforming meta- 
zoans but only a few are likely to find widespread appli- 
cation in insect science. Random integration of DNA 
introduced into germ- or stem-cells provides a workable 
strategy for creating transgenic mammals (Maclean, 
1995) but will only be of limited value in producing 
transgenic arthropods until efficient selection methods 
are developed for recognizing rare transgenic insects. 
Similarly, retroviruses play a major role in efforts to create 
transgenic animals and animal cells with the exception 
of insects (Maclean, 1995). Until recently the use of 
retroviral vectors was not an option for insect scientists 
because of the absence of retroviruses that originated 
from or functioned in insects. The recent discovery of 
the first invertebrate retrovirus (gypsy) (Kim et al., 1994) and the 
development of pantropic pseudotyped mammalian retroviruses 
that have expanded host ranges (Yee et al., 
1994) now permit the construction of retroviral insect 
vectors to be seriously explored. A completely novel 
strategy for genetically manipulating the phenotype of 
insects involves genetically manipulating microbial symbionts 
that inhabit various insect tissues (Beard et al., 
1993). Such insects might be referred to as paratransgenic 
insects. One strategy, however, that is certain to be 
generally useful in producing transgenic insects, and is 
the exclusive focus of this review (Table 1), is the construction 
of insect gene vectors from transposable 
elements containing short inverted terminal repeats. 

Transposable elements include a diverse collection of 
genetic elements which have in common the ability to 
promote recombination reactions that result in the move- 
ment of the element from one location in the genome to 
another. Structurally and mechanistically this is 
accomplished in a variety of ways. An operational classification 
system that has proven convenient in discussing 
transposable elements divides them into two classes 
depending upon the general features of the mechanisms 
of movement (Finnegan, 1989). Class I elements are 
those that transpose by reverse transcription of an RNA 
intermediate, similar to retroviral integration. Class II 
elements appear to transpose directly from DNA to DNA. 
Class II elements are usually of small to moderate length 
(less than 10 kb). A subclass of these elements has in 
common the presence of terminal sequences that form 
inverted repeats. These terminal inverted repeats are usually 
(but not always) less than 50 base pairs and can be 
as few as eight base pairs. Members of this subclass of 
elements are called short inverted repeat-type elements 
and have proven most useful in constructing gene vectors 
for Drosophila and are becoming increasingly important 
for plant genome manipulation (Hehl, 1994; Walbot, 
1992). 
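
The structural feature that defines this subclass can be made concrete with a small sketch. A terminal inverted repeat is present when the first bases of the element read the same as the reverse complement of its last bases. The code below is a minimal illustration using a made-up 12 bp terminus and filler sequence; it is not taken from any real element, it only detects perfect repeats, and all names are hypothetical.

```python
# Illustrative sketch: measuring a perfect terminal inverted repeat (TIR).
# The toy sequences are invented for demonstration only.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def terminal_inverted_repeat_length(element: str) -> int:
    """Length of the longest perfect TIR at the ends of `element`.

    The left terminus is compared base by base with the reverse
    complement of the right terminus; the result is capped at half
    the element length so the two termini never overlap.
    """
    seq = element.upper()
    length = 0
    for left_base, right_base in zip(seq, reverse_complement(seq)):
        if left_base != right_base:
            break
        length += 1
    return min(length, len(seq) // 2)


toy_terminus = "GATTACAGGCTA"      # hypothetical 12 bp terminus
toy_internal = "TTTTCCCCAAAAGGGG"  # hypothetical internal sequence
toy_element = toy_terminus + toy_internal + reverse_complement(toy_terminus)

print(terminal_inverted_repeat_length(toy_element))
```

Real elements often carry imperfect repeats, so a practical search would tolerate mismatches; the exact-match version is used here only to keep the definition visible.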

Short inverted repeat-type elements have two major 
functional components, a transcription unit that encodes 
a protein (transposase) required for transpositional 
recombination and the terminal inverted repeats of the 
element including 100 bases or more of terminus-flanking 
DNA. Although not well characterized in eukaryotes, 
transposases are probably recombinases. Likewise the 
exact role of the terminal repeats of this class of element 
is not known but they appear to serve as pointers, 
directing the recombinase and/or other factors to the pro- 
per position on the element to initiate recombination. 
Short inverted-repeat type elements are particularly 
amenable to being converted to vectors because the two 
components of the element can function in trans. That 
is, transposase produced by one element can act on the 
terminal sequences of another element of the same type 
and promote its transposition. Consequently, attaching 
the terminal sequences to any other non-transposable 
element sequence results in the creation of a chimeric 
transposable element capable of transposing when the 
appropriate transposase is present. Trans-mobilization is 
a useful feature of this type of transposable element 
because it permits the investigator to stabilize the movement 
of the element following integration by removing 
the source of transposase. These characteristics were first 
exploited in eukaryotes for the purposes of gene vector 
development using the P-element from Drosophila melanogaster 
(Rubin and Spradling, 1982; Spradling and 
Rubin, 1982) and we refer to this as the P-element paradigm. 

The P-element paradigm for genetic transformation of 
D. melanogaster can be simply described as one that 
relies on a modified short inverted repeat-type transpos- 
able element in which the transposase function is pro- 
vided in trans to the essential terminal sequences. The 
terminal sequences are attached to the gene or DNA 
sequence to be integrated, forming a chimeric transpos- 
able element (now called a vector). Upon entry into a 
nucleus, transposase promotes the cutting and joining of 
the vector to a chromosome of the host, resulting in 
chromosomal integration of the vector. Because move- 
ment of the vector does not involve an RNA intermedi- 
ate, the types of sequences that can be included in the 
vector are not severely limited and may include introns. 
A particularly important feature of this paradigm is the 
use of host strains of D. melanogaster that do not contain 
endogenous sources of the appropriate transposase or 
forms of the element that might inhibit vector integration. 
This paradigm for gene transfer insures that the integrated 
vector will not continue to transpose and will not 
be lost through excision. The existence of D. melanogaster 
strains lacking specific families of transposable 
elements has been critical in the successful development 
of gene-transfer technology in this species and has permitted 
the development of additional vector-based technologies 
that would not have been possible if stability 
could not be assured. 

TABLE 1. Short inverted repeat-type transposable elements capable of transposition in non-host species 

Element         Host Species      Non-Host Species                        References 

hobo            D. melanogaster   M. domestica, B. tryoni,                O'Brochta et al., 1994; Pinkerton et al., 
                                  H. armigera, D. virilis                 1996; Lozovskaya et al., 1996 
Hermes          M. domestica      D. melanogaster, B. tryoni,             O'Brochta et al., 1996; Sarkar et al., 
                                  C. macellaria, L. cuprina, Ae. aegypti  unpublished 
Minos           D. hydei          D. melanogaster, C. capitata            Loukeris et al., 1995a; Loukeris et al., 
                                                                          1995b 
mariner         D. mauritiana     D. melanogaster, L. cuprina             Lidholm et al., 1993; Coates et al., 
                                                                          unpublished 
P               D. melanogaster   D. hawaiiensis, D. simulans             Brennan et al., 1984; Scavarda 
                                                                          and Hartl, 1984 
piggyBac/IFP2   T. ni             S. frugiperda                           Fraser et al., 1995 

The P-element has proven useful in manipulating the 
genome of D. melanogaster in ways that go beyond its 
straightforward use as a gene vector and a simple tool 
for integrating and expressing foreign DNA. Two of the 
more notable applications are transposon tagging 
(Bingham et al., 1981; Searles et al., 1982) and enhancer 
trapping (Bellen et al., 1989). These are powerful 
methods using modified gene vectors for identifying and 
isolating genes. The diversity and effectiveness of tools 
that have been developed using P-elements to investigate 
the biology of D. melanogaster illustrates the utility of 
short inverted repeat-type transposable elements and jus- 
tifies current efforts to follow the P-element paradigm so 
that analogous tools can be developed for other insects. 
Of course, the effectiveness of these tools in D. mel- 
anogaster is due, in part, to our sophisticated understand- 
ing of Drosophila genetics and our ability to perform 
elegant genetic experiments in this species. Conse- 
quently, the development and use of gene transfer and 
related technologies will also require an increase in our 
efforts to develop a foundation of genetics for non-drosophilid 
insects. In those insects where there already exists 
a foundation of genetic information, gene transfer tech- 
nology is likely to have a rapid, positive impact that will 
begin in the laboratory and may extend to the field. 



PROGRESS TOWARDS FUNCTIONAL NON- 
DROSOPHILID INSECT VECTOR SYSTEMS 

The power and versatility of the P-element paradigm 
has fueled efforts to develop analogous systems for non- 
drosophilid insects. These efforts have proceeded in two 
general directions. The first has relied on exploiting 
transposable elements that were isolated from D. melanogaster 
or related species and testing their abilities to 
excise, transpose and act as gene vectors in various non-drosophilid 
insects. The second strategy has involved 
identification and isolation of new transposable elements 
from non-drosophilid insects that might be converted to 
gene vectors. Both strategies have proven successful and 
operative vectors are now available for a small number 
of species. 



THE P-ELEMENT FAMILY 

Description 

P-elements are the prototypical short inverted repeat-type 
element in eukaryotes; they are almost 3 kb in 
length, encode a single protein called transposase and 
have terminal inverted repeat sequences of 31 nucleo- 
tides. The discovery, isolation and characterization of this 
element has been reviewed elsewhere (Engels, 1989). 
While P-elements provide us with an example of how an 
effective insect transformation system might be con- 
structed and applied they are currently of only minor 
interest to those interested in non-drosophilid insect gene 
vectors. The P-element's diminishing significance to current 
efforts to develop non-drosophilid transformation 
technology is due to the limited mobility properties of 
these elements in non-drosophilid species. 

Mobility in the host 

The P-element's mobility in D. melanogaster is well 
studied and documented (Engels, 1989). Movement of 
P-elements is strain-dependent with certain strains 
repressing almost all P movement and others supporting 
high rates of P-element excision and transposition. 
Within D. melanogaster, P-elements are effective agents 
for genome manipulation because of their rates of move- 
ment and our abilities to regulate them. Details of these 
mobility properties can be found elsewhere (Rio, 1991). 

Mobility in other insects 

The D. melanogaster P-element has been successfully 
used as a transformation vector in the closely related 
species, D. simulans (Scavarda and Hartl, 1984) and a 
more distantly related drosophilid D. hawaiiensis 
(Brennan et al., 1984). Neither of these species possesses 
endogenous P-elements and, in both cases, P-elements 
could undergo transpositions in subsequent generations 
suggesting that both the transposase and host-supplied 
factors were present and capable of functioning in 
these species. 

All efforts to employ P-elements as gene vectors in 
non-drosophilid insect species have failed. The success- 
ful creation of transgenic non-drosophilid insects using 
P-element vector-containing plasmids has been reported 
but in no case has the integrated DNA consisted of only 
sequences precisely delimited by the termini of P-elements 
(McGrane et al., 1988; Miller et al., 1987; Morris 
et al., 1989). Most efforts to use P-elements in non-drosophilids 
have remained unpublished because transgenic 
insects were not successfully produced and no evidence 
of movement was obtained. References to these 
efforts can be found occasionally [see Walker (1989) for 
mention of efforts to transform Locusta migratoria and 
Apis mellifera] indicating that between 1985 and 1990 
there was considerable effort to use P-elements in non-drosophilids. 
The silence that followed those efforts is 
evidence of their failure. In addition, using in vivo plasmid-based 
recombination assays the excision or transposition 
of P-elements (Fig. 1) following their introduction 
into non-drosophilid host insects was never detected 
(O'Brochta and Handler, 1988). Analysis of the P-element's 
ability to excise in non-hosts, including non-drosophilids, 
using plasmid-based excision assays 
revealed a decreased ability to support movement as 
phylogenetic distance from D. melanogaster increased. 
All non-drosophilid insects tested were unable to support 
P-element excision (O'Brochta and Handler, 1988). Collectively, 
these data suggested that, in the absence of 
significant modification, P-elements were unlikely to be 
useful as transformation vectors in non-drosophilid 
insect species. 

FIGURE 1. Transient transposable element mobility assays performed 
in vivo. (A) Transposable element excision assays. The assay shown 
is designed to monitor the excision of a hobo element containing the 
gene encoding β-galactosidase although other genetic markers could 
be used. Excision indicator and helper plasmids are coinjected into 
preblastoderm insect embryos which are then allowed to develop. Plasmids 
are recovered and recombinant plasmids resulting from element 
excision are identified by their unique genotype. By omitting the plasmid 
encoding transposase the assay can be used to look for the presence 
of endogenous transposase-like activity in alternative insect hosts. 
For full details see Atkinson et al., 1993. Amp-R = gene encoding 
ampicillin resistance; Tet-R = gene encoding tetracycline resistance. 
(B) Transposable element transposition assay. The assay shown is 
designed to monitor the transposition of a hobo element containing a 
gene encoding resistance to kanamycin (Kan-R) from the donor plasmid 
to a target plasmid. Transposition is mediated by the transposase 
encoded by the helper plasmid. In this version of the assay, the target 
plasmid contains the sacRB gene of Bacillus subtilis, inactivation 
of which can be selected for when Escherichia coli cells containing it 
are plated on appropriate media. Transposition is confirmed by 
determining the DNA sequence of the two junction points of hobo 
transposable element integration into the sacRB gene. For full details 
see O'Brochta et al., 1994. This strategy for assessing transposable 
element mobility can be employed for any short inverted repeat-type 
element and, as described in the text, related assays are now available 
for Hermes and mariner elements. 

Related elements in other insects 

P-elements have been found in a number of drosophilid 
species including Scaptomyza pallida, D. paulistorum, 
D. bifasciata, D. nebulosa and D. willistoni 
(Anxolabéhère and Periquet, 1987; Daniels et al., 1984, 
1990; Hagemann et al., 1990; Lansman et al., 1987). 
Their discontinuous distribution within this family, 
together with the high degree of sequence similarity 
observed among elements from different species, suggests 
that horizontal transfer has, in part, been responsible for 
the distribution of P-elements in the Drosophilidae (Clark 
et al., 1994; Kidwell, 1992). Until recently the distribution 
of P-elements and P-element-related sequences 
was believed to be rather limited. However, P-element-related 
sequences have been found in Lucilia cuprina 
(Perkins and Howells, 1992) and other cyclorrhaphous Diptera 
(H. Robertson, personal communication). 

THE hobo OR hAT ELEMENT FAMILY 

Description 

The hobo element was identified in D. melanogaster 
as a factor causing genetic instabilities when certain 
strains were hybridized (Lim et al., 1983; Stamatis et al., 
1981). Similar instabilities had been found associated 
with the movement of P-elements. The physical isolation 
of hobo occurred during the investigation of glue-protein 
genes of D. melanogaster and was discovered as an insertion 
element in a spontaneous allele of Sgs-4 (McGinnis 
et al., 1983). 

hobo is a typical short inverted repeat-type element 
with a copy number of 0-50 per genome, depending on 
the strain. It is approximately 3 kb in length with 12 bp 
terminal inverted repeats, creates 8 bp duplications at the 
points of integration and probably encodes a single pro- 
tein (transposase) (Blackman and Gelbart, 1989). Analy- 
sis of an autonomous hobo element (HFL1) revealed 
sequence similarities with Ac from Zea mays (corn) and 
Tam3 from Antirrhinum majus (snapdragon) (Calvi et al., 
1991; Feldmar and Kunze, 1991). The DNA sequence of 
HFL1 differs from the original description of the hobo 
element (hobo₁₀₈) in five regions, most significantly in 
that the transposase open reading frame is extended by 
42 amino acids at the carboxy end of the polypeptide. 
The similarities among hobo, Ac and Tam3 are largely 
confined to three regions of the transposase open reading 
frame and consist of blocks of approximately 50 amino 
acids that are 35% identical and 50% similar to corresponding 
regions of Ac and Tam3. These data suggest 
that hobo, Ac and Tam3 are related members of a family 
of transposable elements and for convenience we refer 
to this as the hAT family. 

Although within the family Drosophilidae hobo has a 
rather limited distribution, the hAT family of transposable 
elements appears widespread in insects (see next 
section). Of 134 species of Drosophila analyzed, hobo 
sequences were found in only the melanogaster species 
group and confined to the melanogaster and montium 
subgroups (Daniels et al., 1990). As with P-elements 
there were strains of melanogaster that appeared to completely 
lack the hobo element and these were usually 
older laboratory strains. These data and a comparison of 
the hobo elements found in other Drosophila species have 
led to the hypothesis that hobo has been recently (within 
the last 100 years) introduced into D. melanogaster by 
horizontal transfer (Simmons, 1992). 

Mobility in the host 

Parallels between structure, distribution and genetics 
of P- and hobo elements suggested that hobo too could 
be converted to a useful gene vector. Following vector 
design, use and detection strategies that precisely paralleled 
the P-element paradigm, Blackman et al. (1989) 
used hobo as a gene vector in D. melanogaster. hobo's 
performance was comparable to that of P-elements, with 
approximately 20% or more of the fertile adults 
developing from embryos injected with the vector producing 
transgenic progeny. Transposition involved only 
sequences delimited by the terminal inverted repeats of 
the element and an 8 bp direct duplication of the integration 
site was created. Furthermore, hobo mobility is 
confined predominantly to the germline by virtue of tissue 
specific expression of the hobo transposase gene 
(Calvi and Gelbart, 1994). While hobo integration rates 
are comparable to those observed using P-elements, insertion 
site preferences of these two elements are distinct. 
The spatial distribution of hobo and P-element integration 
sites within the Drosophila genome overlapped, 
but there were chromosomal regions that were preferred 
by only one of the elements (Smith et al., 1993). This 
difference in insertion site preference is a useful characteristic 
of hobo vectors because more comprehensive 
transposon mutagenesis efforts can now be undertaken in 
D. melanogaster. 

Although little experimental work on the mechanism 
of hobo movement exists, recent analysis of hobo movement 
in D. melanogaster revealed excision reaction products 
similar to those recovered following Ac and Tam3 
excision in their respective hosts (Atkinson et al., 1993). 
This similarity supports the hypothesis that Ac, Tam3 and 
hobo are related members of a family of elements that 
move using a common mechanism. 

Mobility in non-hosts 

The resemblance of hobo to Ac and Tam3 suggested 
that hobo may share additional properties with Ac and 
Tam3. such as the ability to transpose when introduced 
into species which do not normally contain the clement 
(non-host species). The Ac transposable element system 
from maize has an almost unrestricted ability to function 
in non-host plant cells. Ac, although originally isolated 
from maize, can excise and/or transpose when introduced 
into many species of monocot- and dicotyledonous plants 
including tobacco, tomato and petunia (Baker et al.,
1986; Chuck et al., 1993; Peterson and Yoder, 1993).
Ac's ability to function in non-host species has led to its
use as an effective transposon tagging agent in
Arabidopsis thaliana and other plant species (Aarts et al.,
1993; Hehl, 1994). The first direct demonstration of
hobo's ability to function in a non-host insect was
reported by Atkinson et al. (1993). They demonstrated
that hobo could excise from plasmids introduced into and 
transiently maintained extrachromosomally in cells of 
developing Musca domestica (housefly) embryos (Fig. 
1). Similar results were reported in distantly related Dro- 
sophila species (Handler and Gomez, 1995) and currently 
the range of species in which hobo has been shown to 
excise includes the mosquitos, Aedes aegypti and Aedes 
australis (A. Sarkar, K. Yardley, K. Saville, D. 
O'Brochta, A. James and P. Atkinson, unpublished 
results), and the lepidopteran, Helicoverpa armigera
(Pinkerton et al., 1996).

hobo excision is a good indicator of the presence of 
hobo transposase or transposase-like activity but exper- 
iments that directly measure hobo transposition are more 
valuable in determining the potential of the element to 
serve as a vector. O'Brochta et al. (1994) used a
modification of the excision assay to directly measure hobo
transposition in embryos of host and non-host species
(Fig. 1). In D. melanogaster, hobo transposition was
transposase-dependent, involved only DNA delimited by
the 12 bp inverted terminal repeats of the hobo element
and resulted in the creation of an 8 bp duplication of the
insertion site. The consensus integration site was similar
to genomic integration sites reported by Streck et al.
(1986) in D. melanogaster. Therefore the plasmid-based
hobo transposition assay is a reliable indicator of element
behavior and can be used to assess hobo mobility in
insects other than Drosophila.
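
As an illustration only, the target-site duplication hallmark described
above can be checked on sequenced junctions. The following minimal
Python sketch assumes that the target sequences immediately flanking a
recovered insertion are available as strings. The function name and the
example sequences are hypothetical and are not taken from the cited
studies.

```python
from typing import Optional


def target_site_duplication(left_flank: str, right_flank: str,
                            length: int = 8) -> Optional[str]:
    """Return the duplicated target sequence, or None if there is none.

    left_flank  : target DNA immediately 5' of the inserted element
    right_flank : target DNA immediately 3' of the inserted element
    length      : expected duplication size (8 bp for hobo in the text)
    """
    left = left_flank[-length:].upper()
    right = right_flank[:length].upper()
    # A direct duplication means the last `length` bases before the
    # element are repeated as the first `length` bases after it.
    if len(left) == length and left == right:
        return left
    return None


# Hypothetical flanking sequences, invented for illustration only.
left_flank = "AACG" + "GTACAAGT"
right_flank = "GTACAAGT" + "TTCGA"
tsd = target_site_duplication(left_flank, right_flank)
print(tsd)
```

A junction pair that returns None under this check would not show the
direct duplication that the text lists among the hallmarks of
transposition.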

Interplasmid transposition assays in a number of 
diverse non-drosophilid insect species including the 
housefly, M. domestica (family, Muscidae), the
Queensland fruitfly, Bactrocera tryoni (family, Tephritidae) and
the corn earworm, H. armigera (order, Lepidoptera)
demonstrated hobo's ability to transpose accurately in
non-host insects (O'Brochta et al., 1994; Pinkerton et al.,
1996). The frequency of hobo transposition in non-droso- 
philids was less than that observed in D. melanogaster 
under similar conditions but, as in D. melanogaster, 
transposition involved only sequences precisely 
delimited by the terminal inverted repeats and integration
occurred into 8 bp target sites resembling those used in 
the host species. These data demonstrate the ability of 
hobo to function in widely diverged non-host species and 
strongly suggest that it will be capable of serving as a
germ-line transformation vector in these and related spec- 
ies. 

Recently Lozovskaya et al. (1996) successfully used
hobo as a germ-line transformation vector in D. virilis,
a species approximately 40 million years diverged from
D. melanogaster. Although only two of the 398 germlines
tested produced transgenic progeny, integration
resulted from precise transpositional recombination with
all of the hallmarks described above. The sequence of
the 8 bp target site duplication closely resembled the
consensus target reported by others. Although the D. virilis
strain used in these experiments was not examined for
the presence of endogenous hobo elements, others have
found no physical evidence for such elements in this
species. Handler and Gomez (1995) have, however,
reported the presence of a hobo transposase-like activity
which might originate from a related transposable 
element system. 

The tephritid fruitfly, B. tryoni, has recently been
transformed using a hobo vector containing a gene conferring
resistance to the antibiotic G418 as a selective marker (S.
Whyard, D. O'Brochta, P. Atkinson, unpublished data).
Transgenic G1 insects were obtained at a moderate
frequency (9.6%) and some of these integrations appear to
have resulted from an interaction between the hobo
vector and an endogenous hobo-like element (called Homer)
known to exist in this species (see the following section).
The nature of this interaction is currently under
investigation but it has resulted in the inability to establish a
stable transgenic line from these clearly transgenic
individuals (S. Whyard, D. O'Brochta, P. Atkinson,
unpublished data).

Related elements in other species 

Atkinson et al. (1993) showed that hobo excision-reaction
products could be recovered from M. domestica without
supplying a source of hobo transposase. Because hobo
excision in the host, D. melanogaster, was
transposase-dependent, Atkinson et al. (1993) proposed that M.
domestica embryos must contain an endogenous source of
hobo transposase-like activity. They went on to show that
M. domestica contains a functional transposable element
system (called Hermes) closely related to hobo (Warren
et al., 1994). Using a similar approach, the presence of
hobo-related systems has been detected in the
drosophilids D. virilis, D. melanica, D. repleta, D. saltans, D.
willistoni and Chymomyza procnemis (Handler and
Gomez, 1995), the tephritids Bactrocera tryoni (S.
Whyard, A. Pinkerton, D. O'Brochta and P. Atkinson,
unpublished data), B. cucurbitae, B. dorsalis, C. capitata
and Toxotrypana curvicauda (A. Handler, personal
communication) and the mosquitos Aedes aegypti and
Aedes australis (A. Sarkar, K. Saville, K. Yardley, A.
James, P. Atkinson and D. O'Brochta, unpublished data).
Hence, using hobo excision as a bioassay for hobo or
hobo-like transposase activity, a number of investigations
have been able to detect the presence of hobo-like
transposable element systems in species distantly related to
D. melanogaster, supporting the idea that hAT elements
are widespread. The following is a brief description of
hAT elements subsequently isolated from non-drosophilid
insects.

Hermes (M. domestica)

The presence of Hermes was originally inferred from
genetic data obtained from M. domestica (described above)
and the element was isolated using PCR with partially
degenerate oligonucleotide primers homologous to
conserved regions of the hobo, Ac and Tam3
transposases (Atkinson et al., 1993). Warren et al. (1994)
isolated and sequenced overlapping segments of several
Hermes elements from M. domestica and constructed a
2749 bp consensus sequence. Full-length Hermes
elements contain a single long open reading frame (ORF)
which, when conceptually translated, yields a protein of
612 amino acids that is 55% identical (71% similar) to
hobo transposase (Fig. 2). Hermes is flanked by 17 bp
imperfect terminal inverted repeats that are almost
identical to those of hobo and clearly related to the inverted
terminal repeats of Ac and Tam3 (Fig. 3). Comparison
of Hermes, hobo, Ac, Tam3 and an Ac-like element from
pearl millet showed that the transposase of Hermes is
most similar to that of hobo (Warren et al., 1994).
Limited efforts were made to assess Hermes sequence
variation between M. domestica strains, and of eight strains
examined, all showed evidence of at least one full-length
or near full-length element. Six of the eight strains
contained internally deleted Hermes elements while the
remaining two appeared to contain only intact elements
(Warren et al., 1994). Therefore, Hermes is a
transposable element that is closely related to hobo and, based
on structural criteria, appeared functional. Direct tests of
Hermes mobility, described in the next section,
confirmed this hypothesis.

Hermit (L. cuprina)

hermit was isolated from the genome of the Australian
sheep blowfly, L. cuprina (family, Calliphoridae) by
screening a genomic DNA library with hobo sequences
under conditions of reduced stringency (Coates et al.,
1996). A single clone was recovered and DNA sequence
analysis confirmed that it was a transposable element
(hermit) 2716 bp in length with 15 bp terminal inverted
repeats. Ten of the first 12 terminal nucleotides of hermit
are identical to the corresponding nucleotides of the hobo
element (Fig. 3). As with Hermes, this similarity extended
to other members of the hAT family. Southern blot
analysis of genomic DNA indicated that hermit was present
once in the genome of the strain of L. cuprina from
which it was originally isolated and in 10 other strains
from Australia, New Zealand and South Africa.

hermit is 243 bp shorter than a full-length hobo
element with most of this difference accounted for by
two deletions within the putative ORF of the element.
One deletion removes the N-terminal region of the
protein while the second removes 18 amino acids from a
region that is highly variable in all known hAT-element
transposases. When comparing overall nucleotide
identity, hermit is 49% identical to hobo and 51% identical
to Hermes. Considering only the nucleotide sequence of
the ORFs, hermit is 54% identical to hobo and 53%
identical to Hermes (Fig. 2). Conceptual translation of
the ORF within hermit results in a protein that is 64%
similar and 42% identical to hobo transposase and 61%
similar and 41% identical to the Hermes transposase.

Homer 1 and Homer 2 (B. tryoni)

As for the Hermes element of M. domestica, the
presence of hAT-like elements in B. tryoni was originally
inferred from hobo excision assays which showed that
the hobo element could be recognized and mobilized by
endogenous factors in developing embryos of B. tryoni.
As with Hermes, these hAT elements were isolated using
a PCR-based approach. Unlike Hermes, Hector and
hermit, however, there are at least two related forms of the
hAT element present in B. tryoni, called Homer 1 and
Homer 2 (A. Pinkerton, S. Whyard, D. O'Brochta and P.
Atkinson, unpublished results). The presence of more
than one form of transposable element in one species has
been observed frequently for mariner elements but has
not been previously recorded for hAT elements. Homer 2
has an intact ORF suggesting that it encodes a functional
transposase. The transposases of Homer 1 and Homer 2
are 49% identical to each other but each is
approximately 70% similar to hobo and Hermes transposases.
Both elements are present in multiple copies within the
genome of B. tryoni; there are approximately 5-10 copies
of Homer 1 and 20 copies of Homer 2 per genome.

FIGURE 2. Alignment of the products from conceptual translation of
the open reading frames of the hobo, Hermes, Homer 1, Homer 2,
Hector and hermit transposable elements showing the high levels of
identity and similarity between them. Alignments were performed
using the GCG software package from the University of Wisconsin,
Madison, WI and displayed using BoxShade 3.0.

FIGURE 3. Alignment of hAT element inverted terminal repeat
sequences showing their similarity. Note that the Homer 1 sequence
is provisional since the right hand inverted terminal repeat has yet to
be identified.



Hector (M. vetustissima) 

Using a PCR-based strategy similar to that employed
to isolate Hermes from M. domestica, Warren et al.
(1995) identified a sequence, called Hector, from the
Australian bush fly, M. vetustissima, that is highly similar
to hobo and Hermes. The hAT transposase-like sequence
from M. vetustissima, when conceptually translated,
encodes a protein 62% identical and 72% similar to the
corresponding region of hobo. Although M. vetustissima
and M. domestica are very closely related species, Hector
appears as similar to hobo as it is to Hermes.
Surprisingly, Hector is a single copy sequence in the M.
vetustissima strain from which it was isolated (Warren et al.,
1995).
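
The percent identity and percent similarity figures quoted for the
hAT transposases come from pairwise comparison of aligned amino acid
sequences. The sketch below shows, under stated assumptions, how such
figures can be computed from two already-aligned sequences. The
similarity groups, the rule of skipping gap columns, and the example
sequences are all hypothetical simplifications; the GCG package cited
in the Figure 2 caption uses its own scoring and is not reproduced here.

```python
# Hypothetical groups of residues treated as "similar". Real alignment
# software derives similarity from a scoring matrix instead.
SIMILAR_GROUPS = [
    set("ILVM"), set("FYW"), set("KRH"),
    set("DE"), set("ST"), set("NQ"), set("AG"),
]


def identity_and_similarity(seq_a: str, seq_b: str):
    """Return (percent identity, percent similarity) for two aligned,
    equal-length sequences. Columns containing a gap ('-') are skipped,
    which is an assumption of this sketch."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = identical = similar = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            identical += 1
            similar += 1  # identical residues also count as similar
        elif any(a in group and b in group for group in SIMILAR_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * identical / compared, 100.0 * similar / compared


# Invented toy alignment, not real transposase sequence.
identity, similarity = identity_and_similarity("MKV-LDES", "MRVALEEP")
print(f"{identity:.1f}% identical, {similarity:.1f}% similar")
```

Because identical residues are counted as similar, the similarity
figure is never lower than the identity figure, which matches the way
the two values are paired in the text.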

Other insect hAT elements 

Fragments of putative hAT elements have been isolated
from other insect species. Using a PCR-based approach
similar to that described above, the existence of hAT
elements in a number of tephritid species including the
melon fly, B. cucurbitae, the oriental fruit fly, B. dorsalis,
the Medfly, C. capitata and the Caribbean fruit fly,
Anastrepha suspensa has been detected (A. Handler and S.
Gomez, personal communication). Interestingly, over the
short region of amino acid sequence available, the
putative melon fly element is 74% identical and 86% similar
to the corresponding region of the ORF of Homer 2 from
B. tryoni. Evidence for hAT elements in insect orders
other than Diptera is quite limited (DeVault and Narang,
1994) and likely reflects the limits of efforts to find these
elements and not the limits of their distribution.

Mobility of non-drosophilid hAT elements 

Direct evidence for mobility of hAT elements other
than hobo is currently only available for Hermes. This
element is transpositionally active in its host, M.
domestica, and in diverged non-host species (O'Brochta et al.,
1996). Using plasmid-based extrachromosomal
transposition assays that have been so effective in assessing
other short inverted repeat-type elements, the mobility of
Hermes was assessed in M. domestica. Hermes
accurately transposed in this species and the recombination
products had all of the hallmarks of a genuine
transposition event, including 8 bp insertion site duplications.
The consensus integration site closely resembled that
obtained for hobo elements (O'Brochta et al., 1996).
These data are significant because they demonstrate that
Hermes is a functional transposable element from a
non-drosophilid insect. Furthermore, these interplasmid
transposition data were obtained from M. domestica that
contain active endogenous Hermes elements. The ability of
Hermes to transpose in such a strain suggests that, in
M. domestica, any host-encoded molecules involved in
repression or regulation of the element are insufficient to
prevent transposition in these assays.

Hermes, like hobo, Ac and Tam3 can also transpose in 
divergent species. The mobility of Hermes in non-host 
species is being tested using interplasmid transposition 
assays and by using it as a germ-line transformation vec- 
tor. Using interplasmid transposition assays, accurate 
Hermes transposition has been detected in L. cuprina and
Cochliomyia macellaria (family, Calliphoridae), B.
tryoni, Ceratitis capitata (family, Tephritidae), D.
melanogaster (family, Drosophilidae), Ae. aegypti (family,
Culicidae) (A. Sarkar, C. Coates, J. Scura, K. Yardley,
A. James, P. Atkinson and D. O'Brochta, unpublished
results) and Helicoverpa armigera (order Lepidoptera,
family Noctuidae) (Pinkerton et al., 1996). Hermes
transpositions in all cases created 8 bp insertion-site
duplications within the target DNA, the consensus of which
was similar to the target-site consensus sequence reported
for hobo. In the case of D. melanogaster and B. tryoni,
interplasmid transposition events were recovered at least
10 times more frequently than when similar experiments
were performed in these species using hobo. In Ae.
aegypti the rate of Hermes transposition was
approximately 100-fold less than that observed for Hermes in D.
melanogaster. Hermes appears to have a wide host range
as well as elevated levels of activity when introduced
into some non-host species. These properties should make
Hermes particularly useful as a non-drosophilid gene
vector.

Hermes has also been used as a germ-line
transformation vector in D. melanogaster (O'Brochta et al., 1996).
Following the P-element paradigm, a binary vector
system was constructed consisting of a non-autonomous
Hermes element containing the D. melanogaster
mini-white gene and a Hermes helper plasmid consisting of
the Hermes transposase open reading frame under the
control of the D. melanogaster hsp70 promoter. Using
standard protocols, a mixture of these plasmids was
injected into preblastoderm embryos homozygous for a
null mutation in white. The results confirmed that Hermes
is an active element having mobility properties
particularly useful for gene vector development. Of the 124
fertile G0 adults recovered and individually tested, 43
(35%) yielded at least one transgenic progeny. This
transformation rate is within the range of rates usually
encountered when using P-elements as a vector in this
species. Hermes also showed evidence of being
hyperactive in Drosophila (relative to P-elements). This was
reflected in the high frequency of multiple integrations
within single germlines. That is, 36 (84%) of the 43
transgenic G0 germlines produced individuals containing
two or more integrated Hermes elements. Furthermore,
most (60%) of the G0 transgenic germlines produced
clusters of transgenic progeny. A cluster was a group of
transgenic G1 progeny comprising 10% or more of the
total progeny arising from a single G0 germline. In a
small number of cases 90% or more of the progeny were
transgenic, indicating that almost every surviving germ
cell following injection acquired an integrated Hermes
element (O'Brochta et al., 1996). The presence of
clusters could reflect not only high rates of integration, but
the integration of the transgene soon after being
introduced into the embryo. In D. melanogaster, Hermes
appears as active as any of the current gene vectors
derived from P, hobo or mariner elements. Integrated
Hermes elements were stable in the absence of Hermes 
transposase but remained capable of remobilization if 
Hermes transposase was reintroduced. All integrated 
Hermes elements examined were products of transpo- 
sitional recombination involving only sequences pre- 
cisely delimited by Hermes termini and resulted in the 
creation of characteristic 8 bp insertion site duplications. 
Little information is available regarding interactions 
between endogenous hobo elements and integrated 
Hermes elements since the transgenic D. melanogaster
containing the Hermes vector was created in a strain
lacking hobo elements. However, following the
introduction of hobo transposase no Hermes revertants
(excisions) were observed after screening 7228
chromosomes (M.M. Green, personal communication). When
more sensitive plasmid-based excision assays were
performed, Hermes excision mediated by hobo transposase was
observed at a rate of approximately 1 excision event per
5000 donor plasmids screened (P. Sundararajan, P.
Atkinson and D. O'Brochta, unpublished data). This is
approximately 10-fold less than the rate of Hermes
excision observed in the presence of Hermes transposase in
this species but it does provide evidence for interaction
between these two elements. In conclusion, Hermes is
one of the first functional non-drosophilid transposable
elements to be isolated and harnessed as a gene vector.
Based on its mobility properties it will certainly be useful
as a germline transformation vector in some species and
continued testing of the host range of existing hAT
elements and the continued search for additional
members of this family of elements in other insect taxa
seem warranted.
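
The transformation statistics reported above for Hermes in D.
melanogaster can be reproduced with simple arithmetic. The short
Python sketch below does so using only the counts given in the text.
The variable names are illustrative, and the Hermes-mediated excision
rate is an approximation derived here from the stated 10-fold
difference rather than a figure reported by the authors.

```python
def percent(part: int, whole: int) -> float:
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Counts taken from the text.
fertile_g0_tested = 124   # fertile G0 adults individually tested
transgenic_g0 = 43        # G0 adults yielding transgenic progeny
multi_insert_g0 = 36      # transgenic germlines with >= 2 integrations

transformation_rate = percent(transgenic_g0, fertile_g0_tested)
multiple_integration_rate = percent(multi_insert_g0, transgenic_g0)

# Plasmid-based excision: about 1 event per 5000 donor plasmids with
# hobo transposase, stated to be roughly 10-fold below the rate with
# Hermes transposase. The second value is therefore only an estimate.
hobo_mediated_excision = 1 / 5000
hermes_mediated_excision_estimate = 10 * hobo_mediated_excision

print(f"transformation rate: {transformation_rate:.1f}%")
print(f"multiple integrations: {multiple_integration_rate:.1f}%")
print(f"estimated Hermes-mediated excision rate: "
      f"{hermes_mediated_excision_estimate:.4f} per donor plasmid")
```

Rounded to whole numbers, the two percentages agree with the 35% and
84% quoted in the text.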

THE mariner ELEMENT FAMILY 

Description

The mariner element was first isolated from the
spontaneously arising white-peach (wpch) mutant of D.
mauritiana (Haymer and Marsh, 1986; Jacobson and Hartl,
1985). mariner(wpch) is 1286 bp in length with a single
long ORF, theoretically encoding a peptide of 346 amino
acids. The element has 28 bp inverted terminal repeats
containing four mismatches and creates a 2 bp TA repeat
at its insertion site. mariner is present in 20-30 copies in
the D. mauritiana genome but is absent from the sibling
species, D. melanogaster (Jacobson et al., 1986). The
mariner element isolated from wpch, despite the presence
of an open reading frame, does not encode a functional
transposase. An autonomous mariner element, referred
to as Mos, was identified and isolated from D. mauritiana
and is capable of mobilizing other mariner elements in
trans. Mos is identical to the mariner(wpch) element
except for six amino acid differences in the putative
transposase coding region (Medhora et al., 1988).

Mobility in the host 

mariner elements can be highly mobile in D. mauriti- 
ana and have been used to generate mutations in the 
white and yellow genes of this species (Bryan et al.,
1990). When the autonomous Mos element was
introduced into a strain containing many non-autonomous
mariner elements, visible mutations were recovered at a
frequency of approximately 1 per 4000 X-chromosomes
screened. Of seven mutant alleles recovered from this 
screen and examined at the molecular level, all contained 
an insertion of mariner. Mobility of mariner in its host, 
D. mauritiana, is not confined to the germline, unlike the 
movement of P- and hobo elements in D. melanogaster.

Mobility in non-hosts 

The mariner and Mos elements are capable of
transposition in D. simulans and D. melanogaster. The
introduction of the mariner(wpch) element into D. simulans by
interspecific crosses between D. mauritiana and D.
simulans resulted in the detection, isolation and
characterization of autonomous mariner elements from D. simulans
(Capy et al., 1990; Maruyama and Hartl, 1991c). This
"bioassay" approach for detecting functional mariner
elements in other species is similar, in principle, to the
bioassay approach that has been used for detecting
functional hobo-like elements in non-drosophilid insects
(Atkinson et al., 1993; Handler and Gomez, 1995).

The mobility of the mariner element in non-host 
insects has been demonstrated in a variety of ways. For 
example, mariner has been used as a germline
transformation vector of D. melanogaster following the P-element
paradigm. Frequencies of Mos-mediated transformation
varied from approximately 4 to 16% (Garza et al., 1991;
Lidholm et al., 1993); however, these frequencies
dropped considerably when additional DNA was inserted
into the mariner element. When a 13.2 kb mariner
element containing 11.9 kb of DNA containing the white
gene (w+) sequence was used as a transformation vector,
Lidholm et al. (1993) recovered only two w+
transformants after testing 271 germlines. This rate of
transformation is approximately 10-fold lower than that
observed using P-elements under similar conditions.
When the mariner vector was reconstructed with a
mini-white gene of only 5.8 kb the transformation frequency
did not improve, suggesting that mariner is relatively
intolerant of the insertion of kilobases of additional
sequences (Lohe et al., 1995). Furthermore, the
integrated mariner elements were somatically stable in D.
melanogaster; despite subsequent introductions of
various forms of the Mos element (expressing functional
transposase) into these transgenic lines, few progeny had
evidence of somatic excision of the integrated mariner
vector (Lidholm et al., 1993; Lohe et al., 1995). When
attempts to remobilize the integrated mariner-w+ element
were made no evidence of transposition was reported.
The dramatic reduction in transposition frequency of
mariner element vectors containing kilobases of
additional DNA together with their subsequent stability
after integration has led to speculation that mariner
vectors may have a strict limit to the amount of additional
DNA they can carry (Lidholm et al., 1993).

mariner is also capable of excision and transposition 
in non-drosophilid insects. Using plasmid-based assays 
(Fig. 1), Coates et al. (1995) recovered mariner
excision-reaction products in the calliphorid, L. cuprina, and the
tephritid, B. tryoni. Excision-reaction products recovered
from these species were identical in structure to those
obtained using the same excision assay system in D.
melanogaster and D. mauritiana. More recently these
investigators used plasmid-based mariner transposition assays
to assess mariner mobility in non-drosophilid insects.
mariner was capable of accurate interplasmid
transposition in developing embryos of both D. melanogaster
and L. cuprina (Coates C., Turney C., Frommer M.,
O'Brochta D. and Atkinson P., unpublished results).
Only sequences precisely delimited by the terminal
inverted repeats transposed and all integrations occurred
into a TA dinucleotide target site. The frequency of
transposition in D. melanogaster was similar to that seen
using hobo and Hermes in this species under similar
conditions but the frequency of mariner transposition in L.
cuprina was somewhat lower. These data demonstrate
that mariner vectors containing up to 2.2 kb of additional
DNA retain their mobility properties under these
conditions, suggesting they will be useful as gene vectors in
insects if the amount of inserted DNA is minimized.

More recently the transposase protein from the Himar1
element (a mariner element from Haematobia irritans)
was purified and shown to promote transpositional
recombination in vitro in the absence of accessory
proteins (Lampe D. and Robertson H., personal
communication). These experiments elegantly
demonstrate the autonomous nature of the Himar1 transposase
and suggest that Himar1 will be capable of transposition
in a wide variety of organisms. They also provide the
opportunity to develop a germline transformation vector
system which involves the direct injection of the mariner
transposase protein together with the Himar1-derived
vector into developing insect embryonic germ cells. This
will eliminate the need to transiently express the
transposase gene in divergent insect hosts and may permit higher
levels of transposase activity to be introduced into the
embryo, resulting in increased integration rates.

Related elements in other insects 

Earlier efforts to determine the distribution of mariner 
elements revealed their presence in several drosophilid 
species, including D. simulans, D. sechellia, D. teissieri,
D. yakuba and Zaprionus tuberculatus (Capy et al., 1991;
Maruyama and Hartl, 1991a, b). These elements were
97% identical and this uncommon homology among such 
divergent species together with the discontinuous phylo- 
genetic distribution of these elements within this family 
of insects led to speculation that mariner has been trans- 
ferred horizontally between species. The distribution of 
mariner-like elements was shown to be widespread after
the serendipitous discovery by Lidholm et al. (1991) of
a mariner-like element in the intron of the preprocecropin
A gene of the Cecropia moth, Hyalophora cecropia.
Using the sequence information from these highly
diverged but related elements, Robertson (1993) used a
PCR-based strategy to survey a large number of
arthropod and non-arthropod species for the presence of
additional mariner-family members. He discovered
mariner-like elements in every major taxon that he
sampled, including insects. These data have recently been
reviewed in some detail elsewhere (Robertson and 
Lampe, 1995). Based on this large data set numerous 
examples were found where the degree of sequence 
identity between mariner-like elements was far greater 
than expected if the elements were being transmitted 
solely by vertical transmission. Consequently, there 
appear to be numerous instances of horizontal transfer of 
mariner-like elements between highly divergent
phylogenetic groups. The mobility properties of mariner as
determined experimentally and discussed above are con- 
sistent with the proposed ability of mariner to be hori- 
zontally transferred between species. 



THE Tc1 ELEMENT FAMILY

Description 

The Tc1 element is a short inverted repeat-type
element originally identified and isolated from the
nematode Caenorhabditis elegans (Emmons et al., 1983).
Related elements are found in a number of other
organisms including insects. Four Tc1-like elements have been
isolated from Drosophila species: HB and Bari1 from D.
melanogaster (Caizzi et al., 1993; Harris et al., 1988),
Uhu from D. heteroneura (Brezinsky et al., 1990) and
Minos from D. hydei (Franz and Savakis, 1991). Tc1-like
elements share similarities in their transposase coding
regions and their inverted terminal repeats. Tc1-like
elements also share limited similarities with mariner
elements. Members of these two families of elements
share approximately 19-20% amino acid identity, create
TA dinucleotide repeats upon insertion into target DNA
and may be derived from a common ancestral element.
The reader is referred to the excellent review by
Robertson (1995) for a comprehensive analysis of the
similarities between Tc1 and mariner elements. For the
purposes of this mini-review and the construction of gene
vectors the Tc1-like element from D. hydei, called Minos,
is currently of the most interest.

Minos was originally identified as a dispersed repeti- 
tive sequence within the transcribed spacer of a rRNA 
gene of D. hydei (Franz and Savakis, 1991). Minos is
1.775 kb in length, has inverted terminal repeats of 255
bp and two long non-overlapping ORFs, the longest of
which shows 32% amino acid identity with the Tc1
transposase. The two open reading frames are part of the same
transcription unit and result in a spliced product. Minos
is inserted at multiple locations within the genome of D.
hydei and different strains of this species display
insertion site polymorphisms (Franz et al., 1994).

Mobility in host species 

Tc1 is highly mobile in C. elegans, while Minos
appears to be the only known active Tc1-like element
from insects. Minos mobility was originally inferred from
the presence of insertion site polymorphisms between
very closely related strains of D. hydei (Franz et al.,
1994). Direct tests of Minos transposition in D. hydei
have not been reported.

Mobility in non-host species 

Minos is also mobile in non-host species of insects. 
Following the P-element paradigm a Minos vector
containing the mini-white gene of D. melanogaster was
constructed and injected into D. melanogaster embryos
expressing Minos transposase (from a previously
incorporated transgene) or into D. melanogaster embryos with
a plasmid containing the Minos transposase gene under
the regulatory control of the D. melanogaster hsp70
promoter (Loukeris et al., 1995a). Both strategies resulted in
transformants at a frequency of approximately 2%. Four
transgenic lines were examined for the presence of
integrated Minos elements and, in all cases, transposition of
Minos into chromosomes was confirmed. Integrated
Minos elements were delimited by the inverted terminal
repeats and created TA dinucleotide duplications at the
integration site (Loukeris et al., 1995a). Minos therefore
is a functional transformation vector for D. melanogaster
although it is currently less efficient than hobo- and
Hermes-based vectors. Integrated Minos elements appear
stable in the absence of Minos-encoded transposase
despite the presence of related elements in this species.

Recently, Minos was successfully used as a germline 
transformation vector in the Medfly, C. capitata, a result
that represents a major advance in the efforts to create
technologies for the production of transgenic
non-drosophilid insects (Loukeris et al., 1995b). A Minos vector
was constructed by replacing the transposase
transcription unit with the cDNA of the white gene from the
Medfly under the regulatory control of the hsp70
promoter of D. melanogaster. The Minos vector and
transposase helper plasmids were co-injected into preblastoderm
Medfly embryos homozygous for a recessive mutation in
the white locus. 1.3% of the fertile G0 adults developing
from injected embryos produced at least one progeny 
with pigmented eyes from which lines were established. 
Vector integration was confirmed by in situ hybridization 
analysis of polytene chromosomes and Southern blot 
analysis of genomic DNA prepared from transformed 
individuals using Minos as a probe. The precise junction 
between the Minos vector and chromosomal DNA was 
not determined but, based on Southern blot analysis, inte- 
gration resulted from transpositional recombination 
involving only the Minos vector. All transgenic lines ana- 
lysed appeared to contain a single integrated Minos vec- 
tor. The stability of the Minos vector within these trans- 
genic insects is currently being investigated. 

THE TTAA-SPECIFIC TRANSPOSABLE ELEMENT 
FAMILY 

Description 

TTAA-specific transposable elements are a diverse
group of short inverted repeat-type elements that were
identified as insertion sequences in baculoviruses that
infect lepidoptera. These sequences were subsequently
found within the genomes of the lepidoptera from which
the baculoviruses were isolated (Beames and Summers,
1990; Carstens, 1987; Fraser et al., 1985; Schetter et al.,
1990; Wang et al., 1989). Information concerning the
structure and biology of these elements is not extensive
but they appear to be a diverse group of elements that
share two features. First, the 5' ends of their terminal
inverted repeats end in two or three C residues. Second,
these elements are always found inserted in a TTAA
target site. Beyond these similarities the elements have little
in common. Their terminal inverted repeats are 13-15
nucleotides and their overall length can vary from 2.5 kb
(piggyBac or IFP2) (Cary et al., 1989; Fraser et al., 1983)
to 780 bp (tagalong or TFP2) (Fraser et al., 1983). The
piggyBac element, isolated originally as an insertion
sequence in the Galleria mellonella NPV (GmMNPV)
following passage in the Trichoplusia ni-derived cell line
TN-368, encodes a single transcript of 2.1 kb in length
(Fraser et al., 1995). These elements currently show no
similarities to other families of transposable elements and
their distribution outside lepidoptera is unknown.

Mobility 

TTAA-specific elements were discovered by virtue of
their insertional mutagenic properties and consequently
appear to be active transposable elements. None of the
elements isolated, except the piggyBac (IFP2) element,
appears to be an autonomous element. piggyBac is
mobile in cell lines derived from T. ni and Spodoptera
frugiperda, and recent evidence demonstrates that it is
capable of acting as a vector, i.e. of transposing while
carrying non-element DNA sequences. Using methods that
resembled those described by O'Brochta et al. (1994),
Fraser et al. (1995) assessed the ability of piggyBac to
transpose in S. frugiperda cells. Fraser et al. (1995)
constructed a piggyBac element marked with the
β-galactosidase alpha peptide ORF and co-transfected
S. frugiperda cells with it, together with an intact piggyBac
element and a wild-type baculovirus (AcMNPV).
Baculoviruses were subsequently recovered from infected
cells that had the modified piggyBac element inserted into
a non-essential region of the viral genome. Integration
clearly resulted from transpositional recombination since
only sequences delimited by the terminal inverted repeats
of piggyBac transposed and all integration events created
TTAA insertion-site duplications. Significantly, transposition
of piggyBac in these experiments required co-transfection
of an intact piggyBac element, suggesting that an
element-encoded transposase was required. It is difficult
to compare the rates of transposition in these experiments
with those observed using hobo or Hermes because of
the difference in experimental design. Nevertheless, these
results are important because they provide evidence for
a functional transposable element system from a family
of lepidoptera (Noctuidae) of tremendous economic
importance and represent a major advance in the efforts
to develop non-drosophilid insect gene vectors. The
components for a functional lepidopteran gene vector
now appear to be available and tests of the practical
utility of this system as measured by mobility in vivo will
be forthcoming.
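
The insertion-site duplications described for these elements (TTAA for piggyBac, TA for Minos) can be illustrated with a minimal sketch. It assumes a toy model in which an element integrates at the first occurrence of its target sequence and ends up flanked by two copies of that target. The function name insert_with_duplication and all sequences are hypothetical and are not taken from the cited work.

```python
def insert_with_duplication(target: str, element: str, site: str = "TTAA") -> str:
    """Insert `element` at the first `site` in `target`, duplicating the site.

    Toy model only: real transposition is enzyme-mediated. This just shows
    why an integrated element is flanked by two copies of its target site.
    """
    pos = target.find(site)
    if pos == -1:
        raise ValueError(f"no {site} target site found")
    end = pos + len(site)
    # One copy of the site stays on the left, a second copy appears on the right.
    return target[:end] + element + target[pos:]


# Hypothetical sequences, chosen only for illustration.
host = "GGCATTAACGT"
element = "CCCTAGAAAGGG"  # made-up element whose ends start with C residues

product = insert_with_duplication(host, element)
print(product)
```

Passing site="TA" to the same function gives the corresponding toy picture for a TA dinucleotide duplication.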

TTAA-specific elements appear to be present in a
number of lepidopteran taxa and may be widespread
within this order. For example, genetically marked
piggyBac elements residing on baculoviruses or plasmids are
capable of excising after introduction into lepidopteran
cell lines that do not contain piggyBac-homologous
sequences (M. Fraser, personal communication).
Excision under these conditions resembles excision of hobo
in M. domestica in the absence of hobo-encoded
transposase (Atkinson et al., 1993). There is a growing body of
evidence indicating that cross-mobilization is due to the
presence of related and functional TTAA-specific
elements in the species from which the cell lines
originated.

LIMITS OF THE P-ELEMENT PARADIGM 

Progress towards the development and use of
transformation vectors using mariner, Tc1-like, TTAA and
hAT elements affirms the utility of short inverted repeat-type
elements and the general applicability of what we
have referred to as the P-element paradigm to the
problem of insect transformation. Despite the encouraging
developments there will be limits or conditions under
which the P-element paradigm will serve us poorly.

First, the P-element paradigm involves the use of
specialized host strains of Drosophila as recipients of the
transgene. Drosophila strains are used that do not possess
full-length, functional copies of the transposable
elements employed as transformation vectors in that
species. The existence of strains devoid of hobo and
mariner elements has permitted the use of these elements
as vectors in a genetic environment free of
element-specific regulatory mechanisms that are responsible for
repressing element movement. These element-specific
repression systems can, however, be overcome under
some conditions such as over-expressing functional
transposase (Steller and Pirrotta, 1986). The presence of
endogenous functional transposable elements of the type
used as a vector may also compromise transgene stability
and make maintenance of a stable transgenic line
impossible. Although it is impossible to determine if strains
completely devoid of elements from which the vector
was constructed are absolutely required for the P-element
paradigm to be effective in non-drosophilids, the
presence of related elements may prove equally problematic.
Strains of insects may be found that are devoid of the
specific element being used as a gene vector but the
presence of related elements may compromise the utility of
the vector because of cross-mobilization.
Cross-mobilization refers to the interaction of related but non-identical
transposable elements and has been reported to occur
between members of the hAT and TTAA families of
elements. While strains of insects analogous to M
(P-free) or E (hobo-free) strains of D. melanogaster would
be desirable, it may be possible to develop functional
vector systems without them by designing "suicide systems"
that permit the newly integrated vector to be disabled
after integration. Vector disabling systems may be
constructed that are analogous to the self-eviscerating
systems used to remove antibiotic resistance genes from
integrated plant gene vectors (Dale and Ow, 1991) and to
create inducible "knockout" mice (Shastry, 1995). These
systems depend upon the strategic placement of
site-specific recombination signal sequences flanking the regions
to be excised from the vector. Expression of the
appropriate recombinase results in excision of the sequences
between the recombination signals. By locating, for
example, the recombination signal sequences for FLP
recombinase, a highly active site-specific recombination
system from yeast, in vector sequences essential for
transposition and expressing FLP recombinase, the
removal of essential vector sequences and permanent
immobilization of the vector will result. It is premature
to say if element-specific regulatory mechanisms will
become a major obstacle to the development of
transposable element-mediated transformation technology in
non-drosophilid insects but their potential importance should
be recognized.
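
The logic of such a vector-disabling system can be sketched in a few lines. This is a toy string model, not a description of any published construct: it assumes two recombination signals in direct orientation, uses the placeholder "[site]" instead of a real recognition sequence, and the function name excise_between_sites is hypothetical.

```python
def excise_between_sites(vector: str, site: str) -> str:
    """Remove everything between the first two copies of `site`.

    Assumes the two sites are in direct orientation, so that recombination
    between them excises the intervening segment and leaves a single site.
    """
    first = vector.find(site)
    if first == -1:
        raise ValueError("no recombination site found")
    second = vector.find(site, first + len(site))
    if second == -1:
        raise ValueError("a second recombination site is required")
    return vector[: first + len(site)] + vector[second + len(site):]


SITE = "[site]"  # placeholder, not a real recognition sequence
vector = "LEFT-" + SITE + "-ESSENTIAL-" + SITE + "-RIGHT"

disabled = excise_between_sites(vector, SITE)
print(disabled)
```

In this picture the segment labelled ESSENTIAL stands for vector sequences required for transposition; once it is excised, the integrated vector can no longer be mobilized.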

The second feature of the P-element paradigm that
may be difficult to satisfy in many insects and arthropods
is the strategy employed to deliver the vector to the cells
of interest. Within the P-element paradigm vector
sequences are delivered directly to developing germ cells
by introducing DNA into the pole plasm of
preblastoderm embryos. This has usually been done by injecting
the vector but biolistics has also been used (Baldarelli
and Lengyel, 1990). Unfortunately, direct introduction of
vector DNA into the pole plasm of preblastoderm
embryos will not be possible for some insects because
of unique qualities of the egg such as impenetrable
chorions or because of the unique reproductive physiology of
the insect. For example, viviparous insects (e.g. tsetse
flies) will probably not be amenable to transformation
following the P-element paradigm. Insects or arthropods
that are minute or that have minute eggs may also be
excluded from the strategies outlined in this review.
Transforming these insects may require completely
different strategies such as the use of viruses that can facilitate
the delivery and entry of the DNA into a target cell and
nucleus (Yee et al., 1994). Direct injection of DNA into
the gonads or hemocoel may be possible alternatives for
transforming some species (Presnail and Hoy, 1992).

The P-element paradigm is clearly a powerful model
upon which to base non-drosophilid insect transformation
technology but its limits will be realized as it is applied
to diverse insects. Other strategies will be required.
Efforts to develop alternatives to the P-element paradigm
are of great importance although not discussed in this
review. The concept of a universal insect vector was
never an entirely reasonable idea and even with the
discovery of elements with broad host ranges it is unlikely
that an element will be found that works in all species.
Fortunately, there has been significant progress in the
discovery and testing of transposable elements amenable to
vector development and there are now multiple vector
systems that can be anticipated (Table 1). It is now clear
that the technical bottleneck created by the absence of
non-drosophilid gene vectors has been breached and the
widespread use of this technology by insect scientists
will be forthcoming.

Abstract

In all organisms, homologous recombination (HR) is involved in fundamental processes such as genome diversification
and DNA repair. Several strategies can be devised to measure homologous recombination in mammalian cells. We
present here the interest of using intrachromosomal tandem repeat sequences to measure HR in mammalian cells and we
discuss the differences with ectopic plasmid recombination. The present review focuses on the molecular mechanisms of
HR between tandem repeats in mammalian cells. The possibility of using two different orientations of tandem repeats (direct or
inverted repeats) in parallel is also an advantage. While inverted repeats measure only events arising by strand
exchange (gene conversion and crossing over), direct repeats monitor strand exchange events and also non-conservative
processes such as single strand annealing or replication slippage. In yeast, these processes depend on different pathways,
most of which also exist in mammalian cells. These data make it possible to devise substrates adapted to specific questions about
HR in mammalian cells. The effect of substrate structures (heterologies, insertions/deletions, GT repeats, transcription) and
consequences of DNA double strand breaks induced by ionizing radiation or endonuclease (especially the rare-cutting
endonuclease I-SceI) on HR are discussed. Finally, transgenic mouse models using tandem repeats are briefly presented.
© 1999 Elsevier Science B.V. All rights reserved. 

Keywords: Homologous recombination; Mammalian cells; Tandem repeat sequences



1. Introduction 

Homologous recombination (HR) plays a central
role in various fundamental processes determining
genome organization and rearrangement such as
molecular evolution [1,2], mating type switching in
yeast (for review, see Ref. [3]), antigenic variation of
the trypanosome (for review, see Ref. [4]),
diversification of immunoglobulin genes in chicken
[5] or in rabbit [6], faithful chromosomal segregation
during meiosis in yeast (for review, see Refs. [7,8]), and
DNA repair (for review, see Ref. [9]).

On one hand, HR is involved in the maintenance
of genome integrity by repairing damaged DNA; on
the other hand, it may also contribute to genome
instability since recombination between homologous
repeated sequences dispersed through the genome
may lead to duplications, inversions, deletions and
translocations [10]. Moreover, gene conversion with
pseudogenes can inactivate a functional allele [11].
Finally, several lines of evidence connect HR with
cancer predisposition or prevention. First,
carcinogens stimulate intrachromosomal HR [12-15].
Second, intrachromosomal HR is elevated in DNA
repair deficient cell lines, in Ataxia telangiectasia cell
lines or in p53 defective cell lines, all of these
phenotypes being associated with cancer predisposition
[16-20]. Third, Rad51, a putative recombination
protein in mammalian cells, has been shown to interact
with the products of the tumor suppressor genes
p53, BRCA1 and BRCA2 [21-25].

The use of intrachromosomal tandem repeats
represents a good strategy to measure HR in
mammalian cells and has often been used because (i) it is
the easiest way to introduce the two partners of
recombination in a broad variety of cells and in one
round of cell transfection; (ii) these systems permit
efficient measurement of HR; (iii) it allows selection
for gene conversion or crossover and deletion events;
and (iv) depending on the orientation of the two markers,
it can measure recombination arising by different
mechanisms such as strand exchange (SE), single
strand annealing (SSA) or replication slippage [26].
Because these mechanisms involve different pathways in
yeast, it is essential to be able to distinguish
between them in mammalian cells also. Here we
present mechanisms of HR deduced from the
knowledge in yeast, and also some molecular
characteristics of HR between tandem repeats in mammalian
cells. These data aid in the design of substrates
adapted to more specific questions about the
regulation of HR in mammalian cells.



1.1. Mechanisms and pathways of homologous
recombination between tandem repeats: the knowledge
from Saccharomyces cerevisiae

Two different orientations of the tandem repeat
recombination substrate can be used: direct repeat
(DR) or inverted repeat (IR). In bacteria and yeast,
RecA/Rad51 protein promotes strand exchange (SE).
SE recombination can act on DR as well as on IR
and lead either to gene conversion or to crossing
over (Fig. 1). Gene conversion between both DR and
IR leaves intact the general structure of the locus.
Depending on the orientation of the substrates,
crossing over results in different structures: crossing over
between IR leads to the inversion of the intervening
sequence (Fig. 1B, Fig. 2A); crossing over between
DR leads to the deletion of the intervening sequence
either by an unequal sister chromatid exchange or by
an intrachromatid exchange (Fig. 2B,C). In addition
to SE recombination, direct repeats (DR) also make it
possible to monitor deletion events arising by
RAD51-independent mechanisms such as single strand annealing
(SSA) (Fig. 3), replication slippage and other
mechanisms [26]. Because of their orientation, inverted
repeats measure neither SSA nor replication
slippage.
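
Why orientation matters for SSA can be shown with a small sketch. It assumes a simplified picture in which a break between the two repeats is resected 5' to 3' on both sides, so that the top strand survives as a single-stranded tail on the left and the bottom strand survives on the right. The repeat sequence, flanks and function names are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def resected_tails(left_top: str, right_top: str):
    """Single-stranded 3' tails left after 5'->3' resection at a break.

    left_top and right_top are the top-strand sequences (5'->3') on each
    side of the break. The top strand survives on the left; the bottom
    strand (the reverse complement of right_top) survives on the right.
    """
    return left_top, revcomp(right_top)


def tails_can_anneal(repeat: str, left_top: str, right_top: str) -> bool:
    """True if one tail carries the repeat and the other its complement."""
    left_tail, right_tail = resected_tails(left_top, right_top)
    return repeat in left_tail and revcomp(repeat) in right_tail


REPEAT = "ATGGCCTTAGC"  # made-up, non-palindromic repeat

# Direct repeats: the same sequence on the top strand on both sides.
direct = tails_can_anneal(REPEAT, "GG" + REPEAT + "TTTT", "TTTT" + REPEAT + "GG")

# Inverted repeats: the second copy is the reverse complement on the top strand.
inverted = tails_can_anneal(
    REPEAT, "GG" + REPEAT + "TTTT", "TTTT" + revcomp(REPEAT) + "GG"
)

print(direct, inverted)
```

With direct repeats the two tails are complementary and can anneal; with inverted repeats both tails carry the same strand of the repeat, so annealing is not possible in this model.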

Extensive studies in yeast have determined genes
involved in the control of HR. SE and SSA represent
the main pathways using homologous sequences to
achieve double strand break (DSB) repair. Besides
DSB repair, replication slippage involves pathways
different from those required for SE and SSA [26]. In
Saccharomyces cerevisiae, the SE mechanism is
dependent on the RAD52 epistasis group including the
RAD51, RAD52, RAD54, RAD55 and RAD57 genes,
whereas SSA does not involve these genes. SSA is
dependent upon SRS2, on the nucleotide excision
repair proteins Rad1 and Rad10 and on the mismatch
repair proteins Msh2 and Msh3 [27,28]. Homologues
to most of these genes have been described in
mammalian cells (Table 1). Clearly, the study of HR in
mammalian cells would benefit from the
complementary use of the two kinds of substrates (DR and
IR).



1.2. Molecular events of intrachromosomal
recombination between duplicated sequences in
mammalian cells

Two kinds of substrates have been used. One used
two LacZ sequences, recombinant cells being
detected by the blue coloration. The second type of
substrate used selectable genes (herpes simplex virus
TK gene in tk- cells, neomycin resistance gene,
hygromycin resistance gene...), recombinant cells
forming clones when the selection was applied.
Additionally, DSB repair has been studied following in
vivo endonuclease treatment.

1.2.1. Spontaneous intrachromosomal recombination 
is a conservative mechanism 

Analysis of the structure of recombinants has
revealed that spontaneous intrachromosomal
recombination is mainly a conservative mechanism in
mouse L-cells, with gene conversion representing
approximately 80% of the recombination events [29].
This was observed using direct repeat or inverted
repeat sequences [30]. The majority of the events
giving rise to crossover products involved unequal
pairing of sister chromatids after DNA replication
(see Fig. 2) [31]. These results contrast with those of
extrachromosomal recombination, which follows a
non-conservative mechanism in mouse L-cells [32-34].
It cannot be excluded that these ratios may differ
in other cell types.

Fig. 2. Models for crossing-over events after SE. (A) Crossing over between two inverted repeats leads to the inversion of the intervening
sequence. The arrows indicate the orientation of the two copies, (X) symbolizes the exchange and the numbers are here to orient the
intergenic sequence. (B) and (C) Recombination between two direct repeats (hatched boxes). The arrows indicate the orientation of the two
copies. The lines represent the duplex DNA. (B) Recombination between two mispaired chromatids. (C) Intrachromatid recombination. The
two copies (hatched boxes) are paired. After the exchange (X), the resolution of the Holliday junction can lead to gene conversion (see Fig.
1) or to a crossing over event. In (B), the crossing over leads to the formation of a unique active gene and to the deletion of the intergenic
sequence on one chromatid and to a triplication on the other chromatid. In (C), the crossing over leads to the maintenance of only one copy
on the chromosome and one loop that can be eliminated. In some cases, the loop can be integrated elsewhere in the genome. (D) Crossover
between two truncated copies. The black box represents the overlapping sequence. (A, C, D) correspond to reciprocal intrachromatid
exchanges.
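
The two crossing-over outcomes summarized in Fig. 2 (inversion between inverted repeats, deletion between direct repeats by intrachromatid exchange) can be written as a toy string model. It assumes identical repeat copies and ignores the excised loop and the sister-chromatid case; the function name crossover_product and the sequences are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5'->3'."""
    return seq.translate(COMPLEMENT)[::-1]


def crossover_product(left: str, repeat: str, spacer: str, right: str,
                      orientation: str) -> str:
    """Chromosomal product of an intrachromatid crossover between two repeats.

    direct:   substrate is left + repeat + spacer + repeat + right;
              the spacer and one repeat copy are lost (deletion).
    inverted: substrate is left + repeat + spacer + revcomp(repeat) + right;
              the spacer is inverted and both copies are kept.
    """
    if orientation == "direct":
        return left + repeat + right
    if orientation == "inverted":
        return left + repeat + revcomp(spacer) + revcomp(repeat) + right
    raise ValueError("orientation must be 'direct' or 'inverted'")


# Made-up sequences for illustration only.
deletion = crossover_product("AA", "ATGC", "AAC", "TT", "direct")
inversion = crossover_product("AA", "ATGC", "AAC", "TT", "inverted")
print(deletion, inversion)
```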

1.2.2. Heteroduplex formation

The different models for HR involve intermediate
structures in which DNA-strand exchanges create
hybrids and heteroduplex DNA between the two
recombining molecules [35-37]. Biochemical studies
of mitotically dividing mammalian cells indicate that
exchange between sister chromatids involves
inter-strand transfer of DNA [38,39]. In addition,
heteroduplex formation during HR promoted by human
nuclear extracts was observed in a cell free system
[40]. The analysis of the intrachromosomal
recombination products between tandem repeats has shown
that recombinant colonies result from unrepaired
heteroduplex DNA [31,41]. Since particular structures
may affect the formation, the stability or the
resolution of the heteroduplex recombination
intermediates, the effect on recombination of the DNA
structure (insertions or deletions leading to loop
formation in the heteroduplex molecule, the homology
requirements...) has been studied.

Fig. 3. The single strand annealing model [32]. The two
complementary strands of the duplex DNA are drawn here. The heavy
lines symbolize the homologous repeat sequences and the arrows
the orientation of the two repeats. (1) A double strand break can
occur even in the non-homologous intervening sequence. (2) After
degradation by a single strand exonuclease, single stranded tails
(ss-tails) are created. If the repeats are in a direct orientation,
complementary ss-tails are created. (3) Annealing of the two
complementary single stranded regions results in a structure
leading to the deletion of the intervening sequence after the resolution
of this structure. (4) SSA is exclusively a non-conservative
mechanism. If the repeats are in an inverted orientation, the
ss-tails (step 2) are identical but not complementary and SSA
cannot act.



1.2.3. Effect of the sequence structure: insertions,
deletions, sequence homology requirements,
microsatellites, transcription

1.2.3.1. Insertions and deletions. The molecular
nature of insertion or deletion mutations
(corresponding to the black boxes in Figs. 1 and 2) in the copies
of the duplication can influence the efficiency of HR.
DNA strand exchange is able to propagate through
heterologous sequences, forming heteroduplex DNA
bearing loops. The resolution of such intermediates
leads more frequently to the excision of the loop
with an efficiency correlated to the size of the loop
[42].

1.2.3.2. Homology requirements. Two alternative
strategies can be drawn to address the question of
how much homology is required for HR in
mammalian cells. The first uses two truncated molecules
with different lengths of overlapping, uninterrupted
homology (Fig. 2D). When the two interacting
molecules share a length of homology between
295 bp and 1.8 kb, the rate of gene conversion is
directly proportional to the length of uninterrupted
homology. This rate is reduced 7-fold with 200 bp of
homology and 100-fold with 95 bp of homology [43].

Table 1
Comparison between strand exchange (SE) and single strand annealing (SSA)

Strand exchange
  Events: gene conversion; crossing over (deletions with direct repeats)
  Yeast gene / mammalian homologue:
    RAD51 / HsRAD51
    RAD52 b / HsRAD52
    RAD54 / HsRAD54
    RAD55 a

Single strand annealing
  Orientation of the repeats: direct repeat
  Yeast gene / mammalian homologue:
    RAD1 / XP-F c
    RAD10 / ERCC1
    MSH2 / HsMSH2
    MSH3 / HsMSH3
    SRS2

a In yeast, RAD55 and RAD57 share homologies with RAD51. In mammalian cells, besides HsRAD51, at least 6 other RAD51 homologues
have been described: XRCC2, XRCC3, RAD51B/hREC2, RAD51H3, RAD51C, RAD51D. Some of these homologues presumably
correspond to RAD55 and to RAD57.

b Defines the epistasis group for homologous recombination in yeast.
c Xeroderma pigmentosum group F.

The second strategy uses two molecules of
approximately the same size but containing sequence
polymorphism. Waldman and Liskay used an HSV-tk
duplication in which one TK copy came from HSV
type 1 and the other copy from HSV type 2
(homeologous recombination). These TK genes share 81%
homology. The authors observed that with 19%
divergence, the rate of intrachromosomal HR was
reduced 1000-fold relative to the rate of HR between
two identical HSV1-tk sequences. In contrast, the
rate of intramolecular or intermolecular
extrachromosomal recombination was only reduced by a
factor of 3 to 15 [44]. These results also argue in favor of
distinct mechanisms for extrachromosomal and
intrachromosomal recombination. Moreover, while
efficient intrachromosomal recombination required a
minimum of 134 to 232 bp of uninterrupted
homology, a single-nucleotide heterology in this minimal
region of homology was sufficient to inhibit efficient
recombination [45]. In addition, when recombination
initiates in a perfectly homologous sequence, it is
able in a second step to propagate through
adjacent sequences exhibiting 19% heterology [45].
Finally, Yang and Waldman [46] showed that gene
conversion involved transfer of uninterrupted blocks of
information from 35 to more than 330 bp. Taken
together, these results are consistent with the notion
that more than 200 bp of homology are required to
initiate efficient gene conversion in mammalian cells.
This has also been found in in vitro reactions in cell
free systems [47].
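
The distinction drawn above between overall sequence identity and the length of uninterrupted homology can be made concrete with a short sketch. It assumes two already aligned sequences of equal length with no gaps; the function names and the 200-nucleotide example are hypothetical and only illustrate the definitions, not the experimental data.

```python
def percent_identity(a: str, b: str) -> float:
    """Percentage of aligned positions at which the two sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def longest_uninterrupted_homology(a: str, b: str) -> int:
    """Length of the longest run of consecutive identical positions."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned and of equal length")
    best = run = 0
    for x, y in zip(a, b):
        run = run + 1 if x == y else 0
        best = max(best, run)
    return best


# Made-up 200-nt sequence and a copy with a single mismatch at position 100.
seq_a = "ACGT" * 50
seq_b = seq_a[:100] + "C" + seq_a[101:]

identity = percent_identity(seq_a, seq_b)
block = longest_uninterrupted_homology(seq_a, seq_b)
print(identity, block)
```

A single mismatch barely changes the overall identity but halves the longest uninterrupted block, which is the quantity the experiments above point to.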

1.2.3.3. Effect of GT repeats. A variety of DNA
sequences may play direct or indirect roles in
recombination by their effects on the DNA structure. It
was proposed that GT and GC repeats, which can
form Z-DNA, may influence recombination [48]. It
has been shown that GT repeats, GC repeats and
minisatellite repeats stimulate extrachromosomal
recombination of transfected DNA [49,50]. However, using an
intrachromosomal assay, a (GT)29 repeat has been
shown to be unable to stimulate recombination or
to influence the distribution of the recombination
events [51].



1.2.3.4. Transcription. Transcription stimulates
homologous recombination in S. cerevisiae [52,53].
Stimulation by transcription of extrachromosomal
recombination [54] and intrachromosomal
recombination [55] has been reported in CHO cells. Alleles
transcribed at high levels recombined about 2- to
7-fold more frequently than identical alleles
transcribed at low levels. In line with this, preferential
repair of UV damage has been shown to abolish the
transcription-stimulation of HR [56]. Finally,
transcription has no effect on recombination induced by
a DNA double strand break [57].

1.3. Stimulation of homologous recombination by
DNA double strand breaks

In different species, DNA-damaging agents were
shown to stimulate homologous recombination. DNA
double strand breaks (DSB) are the main lesion
involved in the stimulation of homologous
recombination in different organisms. DSBs are used to initiate
recombination during meiosis in the yeast S. cerevisiae
(for review, see Refs. [7,8]). The DSB is also one of
the most genotoxic lesions induced by ionizing
radiation and can be repaired by two general mechanisms:
(i) non homologous end joining (NHEJ), which does
not necessarily involve sequence homologies and
promotes the end-joining of the broken extremities;
(ii) homology-directed repair, which involves
homologous sequences and corresponds to the SE and SSA
mechanisms. Ionizing radiation stimulates interallelic
recombination in the endogenous TK locus in a
human lymphoblast cell line [58]. However, the
effect of ionizing radiation on intrachromosomal HR
between tandem repeat sequences varies according to
the cells and/or to the substrates used. Although
HR between two LacZ sequences in CV-1 cells is
stimulated by ionizing radiation [59], ionizing radiation
does not stimulate HR between HSV-TK sequences in mouse
L-cells [12]. The fact that the cell lines used are
different could explain this result. Another
explanation could involve the differences in the substrates,
LacZ vs. TK sequences. In the former case,
recombinants are scored by coloration (β-galactosidase
activity) some hours following the treatment. In contrast,
TK+ recombination events can be scored on
surviving colonies several weeks after the treatment. Thus,
in the former system, recombination events can be
scored before the death of the cells, which generally arises
several days after exposure to radiation. In contrast,
the second system scores only surviving colonies.

Another way to produce DSB in target DNA uses
nucleases. Restriction enzymes corresponding to
single sites present in the target were electroporated
into CHO cells or human cells [42,60]. In these
cases, recombination was increased by more than
10-fold. Electroporation of the rare-cutting yeast
endonuclease PI-SceI also stimulated recombination
provided there was a corresponding cleavage site in
the duplication [60]. The yeast endonuclease I-SceI
has provided a useful tool to study targeted DSB in
mammalian cells, as already reviewed [61]. I-SceI
recognizes a cleavage site 18 bp long. Due to this
large recognition sequence, there is probably no I-SceI
site in the mammalian genome and expression of the
I-SceI enzyme in mammalian cells is not toxic.
However, when the I-SceI restriction site is present
in the duplication, at least 80% of the molecules are
cleaved after transfection of a plasmid expressing
I-SceI enzyme [62]. Induction of a site-directed DSB
into the recombination substrates strongly stimulates
both homologous and non-homologous
recombination [57,63,64]. HR is stimulated 100 times but
non-homologous recombination is stimulated 1000
times [64]. However, using a physical analysis,
homology-directed repair of I-SceI-induced DSBs is
found to account for 30-50% of the observed events
[65]. I-SceI-induced recombination produced mainly
deletion events (80%), which are interpreted as
resulting from SSA. Finally, in contrast with spontaneous HR,
transcription does not stimulate DSB-induced
recombination [57], although we cannot exclude that the
efficiency of DSB-stimulation is so great that it
would mask the effect of transcription.

1.4. Transgenic mouse models

The assay using duplicated repeats to measure 
intrachromosomal recombination can be envisioned 
in vivo in transgenic mice. In that case, recombina- 
tion can only be recorded using genes giving a 
coloration such as the (3-galactosidase gene, but not 
with a gene conferring resistance to a drug. Two 
models were developed to measure recombination in 



specific tissues. One model has been developed to 
analyze the lineage of cells in the myotome; thus, the 
expression of the recombination substrate was driven 
by a promoter which confers expression specifically 
to cells of this compartment: the promoter of the 
a-subunit of acetylcholine receptor. The descendants 
of the recombinant cells are histochemically identi- 
fied, permitting the analysis of the lineage of cells in 
the intact embryos [66]. It appears that the frequency 
of recombination in this tissue is similar to the 
frequency measured in cultured cells: between 1 and 
2 × 10⁻⁶ [66].

Another transgenic model also uses a duplication 
of β-galactosidase genes driven by a meiosis-specific
promoter. Germ line gene conversion was analyzed 
in transgenic male gametes. Spermatids which un- 
dergo intrachromosomal gene conversion produce 
functional β-galactosidase (lacZ⁺), visualized by
histochemical staining. Approximately 2% of the
spermatids, produced by a combination of meiotic
and mitotic conversion events, were LacZ⁺ [67].



2. Conclusions 

Numerous differences exist between intrachromosomal
and extrachromosomal HR. For example, intrachromosomal
recombination is conservative whilst
plasmid recombination is non-conservative in mouse
L-cells; in addition, heterologies and GT repeats affect
plasmid and intrachromosomal recombination
differently. Moreover, the number of plasmid
copies is not controlled and it is well established that
plasmids are subjected to extensive nuclease attack
after transfection. Chromatin structure and the nu- 
clear organization may also account for the differ- 
ences observed between plasmid and intrachromoso- 
mal recombination. Thus, we believe it is most suit- 
able to measure HR between two intrachromosomal 
tandem repeat sequences. 

The other possibilities to measure HR between 
two intrachromosomal sequences are interallelic and 
ectopic recombination, which are, however, technically
complicated to set up. Furthermore, interallelic
HR as well as ectopic HR are rather inefficient in 
mammalian cells [68,42,69]. Finally, the possibility 
of using two different orientations of tandem repeats 
(direct or inverted) and the use of the rare-cutting
endonuclease I-SceI provide a set of complementary
approaches to study HR in mammalian cells. 

Using these strategies, connections between
HR and other fundamental cellular processes have 
been described. Firstly, transcription stimulates HR 
but not DSB-induced HR [57]. Connections with cell 
cycle control and DNA repair have also been studied:
alteration of ATM or p53 function leads to an
increase in HR [18-20]; treatment with carcinogens and
DNA-damaging agents stimulates both gene conversion
and deletion [12]. More generally, efficient repair
of DNA damage diminishes the stimulation
of recombination [17,56]; finally, inhibition of
poly(ADP-ribose) polymerase (PARP) by 3-methoxybenzamide
increases recombination 3- to
4-fold in mouse L-cells [70].

Very little is known about the genetic control of
homologous recombination in mammalian cells. 
Overexpression of human RAD52 stimulates recombination
between two lacZ direct repeats about 3-fold
in monkey cells and confers resistance to ionizing
radiation [59]. The Rad52 stimulation acts via the
single-stranded DNA-binding protein RpA, which is
also involved in replication and DNA excision repair
[71]. One question is how
overexpression of only one component of a multiprotein
complex can stimulate the whole process, i.e.,
homologous recombination. One explanation could
be that Rad52 is the limiting factor of the complex.
However, it has recently been shown that overexpression
of the hamster Rad51 protein in CHO cells
also confers resistance to ionizing radiation and stimulates
HR between direct repeat sequences [72]; this
result is not in accordance with this hypothesis. The
combined analysis of HR between direct repeats, 
between inverted repeats and of the DSB-induced 
HR should provide essential information for understanding
this process in mammalian cells.